Tuesday, September 14, 2021

CST 311 Intro to Computer Networks Module 2: Application Layer

0.0 Learning Outcomes for Module 2

  • This week covers the application layer in depth, including the protocols behind the Web (HTTP), email (SMTP, POP3, IMAP), DNS, P2P, and video streaming/content distribution. Here are the concepts to keep in mind throughout the week.



1.0 Learn the basic principles of network applications. 

  • Network Application: At the core of network application development is writing programs that run on different end systems and communicate with each other over the network. For example, in the Web application there are two distinct programs that communicate with each other: the browser program running in the user’s host (desktop, laptop, tablet, smartphone, and so on); and the Web server program running in the Web server host.
  • Application Architecture: Designed by the application developer and dictates how the application is structured over the various end systems.
    • Client-Server: In a client-server architecture, there is an always-on host, called the server, which services requests from many other hosts, called clients.
      • Example: the Web application, in which an always-on Web server services requests from browsers running on client hosts.
      • Data centers: A data center houses a large number of hosts and is often used to create a powerful virtual server, because a single server host cannot keep up with all the requests from its clients.
      • The server has a fixed, well-known address, called an IP address. Because the server has a fixed address and is always on, a client can always contact the server by sending a packet to the server’s IP address.
        Some of the better-known applications with a client-server architecture include the Web, FTP, Telnet, and e-mail.
    • Peer-to-Peer: In a P2P architecture, there is minimal (or no) reliance on dedicated servers in data centers. Instead, the application exploits direct communication between pairs of intermittently connected hosts, called peers.
      • These applications include file sharing (e.g., BitTorrent), peer-assisted download acceleration (e.g., Xunlei), and Internet telephony and video conference (e.g., Skype). 
      • Pros
        • P2P architectures are self-scalable: in a P2P file-sharing application, although each peer generates workload by requesting files, each peer also adds service capacity to the system by distributing files to other peers.
        • P2P architectures are also cost effective because they normally don’t require significant server infrastructure and server bandwidth.
      • Cons
        • Security, performance, and reliability issues: P2P applications face challenges in all three areas due to their highly decentralized structure.
  •  Processes Communicating
    • Processes: Processes are what actually communicate with each other. A process can be thought of as a program running within an end system. We want to know how processes running on different hosts (with possibly different operating systems) communicate.
    • Messages: Processes communicate by sending messages to and receiving messages from one another.
    • Client and Server Processes: For each pair of communicating processes, we label one process the client and the other the server.
      • Client: Process that initiates communication.
      • Server: Process that waits to be contacted.
    • Socket: A software interface through which a process sends and receives messages. It is analogous to a door in a house that lets people in and out.
    • API (Application Programming Interface): A socket is the interface between the application layer and the transport layer within a host, so it is also referred to as the API between the application and the network.
    • Addressing Processes: To send a message to a process on another host, you must know the IP address of the host and the port number of the receiving process.
      • IP Address: A 32-bit quantity (in IPv4) that uniquely identifies a host.
      • Port Number: Identifies the receiving process (socket) on the destination host. Popular applications have well-known port numbers; for example, a Web server uses port 80.
    • Transport services for applications
      • Reliable data transfer: Provides applications a guarantee that the data will be delivered without loss.
        • Loss-tolerant applications, which aren't greatly affected by slight data loss, include video and audio streaming.
      • Throughput: The rate at which the sending process can deliver bits to the receiving process. Available throughput can fluctuate over time because other sessions share the bandwidth along the network path. A transport protocol could offer a guaranteed available throughput of at least r bits/sec.
        • Bandwidth-sensitive applications are applications that have a minimum throughput requirement.
        • Elastic applications, which include e-mail, file transfer, and Web transfers, can make use of whatever throughput is available.
      • Timing: Guarantees that data will arrive within a certain time period, such as no more than 100 msec after it is sent. Applications that benefit from timing guarantees include video conferencing and multiplayer games.
      • Security: This service can encrypt the data sent by a process, which provides confidentiality between the two processes. It can also provide data integrity and end-point authentication.
    • Transport Services Provided by the Internet: The Internet (and TCP/IP networks in general) provides two transport protocols to applications: UDP and TCP.
      • TCP Services 
        • Includes (1) a connection-oriented service, (2) a reliable data transfer service, and (3) a congestion control mechanism.
          • Connection-oriented service: Before application messages are sent, TCP has the client and server exchange transport-layer control information (the handshake). After the handshake, a TCP connection exists between the two sockets, and the client and server are ready to send and receive messages. When the application finishes sending messages, it must close the connection.
          • Reliable data transfer service: TCP delivers all data without error and in the proper order, with no missing or duplicate bytes.
          • Congestion control mechanism: This mechanism throttles a sending process when the network is congested.
        • Secure Sockets Layer (SSL): An enhancement of TCP that provides additional security services such as encryption, data integrity, and end-point authentication.
      •  UDP Services
        • Provides a no-frills, lightweight transport protocol that is minimal in service. 
        • Characteristics
          • Connectionless: No handshaking occurs before messages are sent.
          • Unreliable: There is no guarantee that a sent message will be received, and messages that do arrive may be out of order.
          • No congestion control mechanism: The sender can pump data into the network at any rate, regardless of how congested the network is.
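
To make the socket, addressing, and UDP ideas above concrete, here is a minimal sketch of one UDP exchange on a single machine. The loopback address, the OS-chosen port, and the function name `udp_round_trip` are illustrative assumptions, not values from the course material.

```python
import socket

HOST = "127.0.0.1"  # loopback address, chosen only for this illustration


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local UDP 'server' socket and get it back uppercased."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind((HOST, 0))               # port 0 lets the OS pick a free port
        server_addr = server.getsockname()   # (IP address, port number) of the server
        server.settimeout(2.0)
        client.settimeout(2.0)

        # No handshake: the client just addresses the datagram and sends it.
        client.sendto(message, server_addr)

        # The server learns the client's (IP, port) from the datagram itself.
        data, client_addr = server.recvfrom(2048)
        server.sendto(data.upper(), client_addr)

        reply, _ = client.recvfrom(2048)
        return reply


print(udp_round_trip(b"hello"))
```

Both sockets live in one process here only to keep the sketch self-contained; in a real application the client and server are separate processes, usually on different hosts.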

1.1 Learn the basics of three client-server application protocols: HTTP, SMTP, and DNS

  •  HTTP: This is the Web's application layer protocol.
    • Characteristics
      • Client-Server Model
        • Client: Browser that requests, receives, and displays web objects.
        • Server: Web server sends (using HTTP protocol) objects in response to request.
      • Uses TCP Service
        • Client initiates TCP connection, creates socket to server, port 80.
        • Server accepts TCP connection from client.
        • HTTP messages (application-layer protocol message) exchanged between browser (client) and Web server (server). 
        • TCP connection is closed. 
      • HTTP is stateless
        • Server maintains no information about past client requests.
        • Maintaining state is complex: if the server or client crashes, their views of the state may be inconsistent.
      •  HTTP Connections (can be either persistent or non-persistent)
        • Non-Persistent HTTP
          • At most one object is sent over a TCP connection.
          • The connection is then closed.
          • Downloading multiple objects requires multiple connections.
          • Response time per object = 2 RTT + file transmission time
        •  Persistent HTTP
          • Multiple objects can be sent over a SINGLE TCP connection between client and server.
          • Response time = as little as 1 RTT (plus file transmission time) per referenced object once the connection is open
      • HTTP Request Message
        • HTTP Request
          • ASCII (Human-readable format)
        • Uploading Form Input
          • Post Method
            • Web page often includes form input.
            • Input is uploaded to server in entity body.
          • URL Method
            • Uses GET method.
            • Input is uploaded in URL field of request line.
      •  HTTP Response Message
        • Includes a status line (protocol version, status code, status phrase), header lines, and data
        • Status codes: 
          • 200 OK
          • 301 Moved permanently
          • 400 Bad Request
          • 404 Not Found
          • 505 HTTP version not supported
      •  Cookies allow user states to be saved.
        • Components (4)
          • Set-cookie header line in the HTTP response message
          • Cookie header line in the next HTTP request message
          • Cookie file kept on user's host, managed by user's browser
          • Back-end database at web site
        • Uses include authorization, shopping carts (keeping them intact), recommendations, and saving user session states.
      •  Web Cache (Proxy Server) satisfies client requests without involving the origin server.
        • Acts as both a client and a server.
        • Benefits include reduced response time for client requests and reduced traffic on an institution's access link.
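
The response-time figures for non-persistent and persistent HTTP can be compared with a small worked example. This is a sketch under simplifying assumptions that are not in the notes: objects are fetched one at a time (no parallel connections, no pipelining), every object has the same transmission time, and `objects` counts the referenced objects in addition to the base HTML file.

```python
def non_persistent_time(rtt: float, transmit: float, objects: int) -> float:
    """Serial non-persistent HTTP: the base file and every object each cost
    one RTT to open a TCP connection, one RTT for the request, plus transmission."""
    return (objects + 1) * (2 * rtt + transmit)


def persistent_time(rtt: float, transmit: float, objects: int) -> float:
    """Persistent HTTP without pipelining: one RTT to open the single TCP
    connection, then one RTT plus transmission for the base file and each object."""
    return rtt + (objects + 1) * (rtt + transmit)


# Hypothetical numbers: 100 ms RTT, 10 ms transmission time, 10 embedded objects.
rtt, transmit, objects = 0.100, 0.010, 10
print("non-persistent:", non_persistent_time(rtt, transmit, objects), "seconds")
print("persistent:    ", persistent_time(rtt, transmit, objects), "seconds")
```

Under these assumptions the persistent connection saves one RTT per referenced object, because the TCP handshake is paid only once.
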
  • Simple Mail Transfer Protocol (SMTP) - SMTP transfers messages from senders’ mail servers to the recipients’ mail servers. 
    • SMTP Characteristics
      • SMTP is used to send and receive email. It is sometimes paired with IMAP or POP3 (for example, by a user-level application), which handles the retrieval of messages, while SMTP primarily sends messages to a server for forwarding.
      • SMTP can both send and receive mail, but it's bad at queuing incoming messages, hence the common delegation to other protocols. 
      • Proprietary systems like Gmail have their own mail transfer protocols when using their own servers, but they still use good old SMTP to email beyond that.  
      • SMTP is an asymmetric protocol: many clients interact with one server. It dates from the early 1980s and runs over TCP, with servers listening on port 25.
    • Comparison with HTTP 
      • 1. HTTP transfers files (also called objects) from a Web server to a Web client (typically a browser); SMTP transfers files (that is, e-mail messages) from one mail server to another mail server. 
        • Pull Protocol: Someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience. In particular, the TCP connection is initiated by the machine that wants to receive the file. 
        • Push Protocol: SMTP is primarily a push protocol—the sending mail server pushes the file to the receiving mail server. In particular, the TCP connection is initiated by the machine that wants to send the file.  
      • 2. SMTP requires each message, including the body of each message, to be in 7-bit ASCII format. HTTP data does not impose this restriction.
      • 3. The two protocols differ in how they handle a document consisting of text and images (along with possibly other media types). HTTP encapsulates each object in its own HTTP response message, whereas SMTP places all of the message’s objects into one message.
    •  Mail Access Protocols - transfer email messages from the mail server to the user's local PC.
      • POP3 
        • Process: POP3 begins when the user agent (the client) opens a TCP connection to the mail server (the server) on port 110. With the TCP connection established, POP3 progresses through three phases: authorization, transaction, and update.
          • 1. Authorization: The user agent sends a username and a password (in the clear) to authenticate the user.
          • 2. Transaction: The user agent retrieves messages; also during this phase, the user agent can mark messages for deletion, remove deletion marks, and obtain mail statistics.
          • 3. Update: Occurs after the client has issued the quit command, ending the POP3 session; at this time, the mail server deletes the messages that were marked for deletion.
        •  Possible Replies From Server 
          • +OK (sometimes followed by server-to-client data), used by the server to indicate that the previous command was fine.
          • -ERR, used by the server to indicate that something was wrong with the previous command.
        • Problems with POP3
          • POP3 protocol does not provide any means for a user to create remote folders and assign messages to folders (IMAP does).
        • Authorization Commands
          • user <username>
          • pass <password>
      •  IMAP 
        • Description 
          • An IMAP server will associate each message with a folder; when a message first arrives at the server, it is associated with the recipient’s INBOX folder. The recipient can then move the message into a new, user-created folder, read the message, delete the message, and so on.
        • Capabilities
          • The IMAP protocol provides commands that allow users to create folders and move messages from one folder to another.
          • Provides commands that allow users to search remote folders for messages matching specific criteria.
          • Has commands that permit a user agent to obtain components of messages.
            • For example, a user agent can obtain just the message header of a message or just one part of a multipart MIME message. 
            • This is useful when there is a low-bandwidth connection between the user agent and its mail server.
      •  Web-Based Email 
        • With this service, the user agent is an ordinary Web browser, and the user communicates with its remote mailbox via HTTP.  
        • When a recipient wants to access a message in his mailbox, the e-mail message is sent from the mail server to the browser using the HTTP protocol rather than the POP3 or IMAP protocol.
        • When a sender wants to send an e-mail, the e-mail is sent from the browser to the mail server over HTTP rather than over SMTP. The sender's mail server, however, still sends messages to, and receives messages from, other mail servers using SMTP.  
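
The three POP3 phases and the +OK / -ERR replies described above can be modeled with a toy, in-memory session. This is only a sketch: `ToyPop3Session` is a hypothetical class, not a real mail server or the standard library's POP3 client, and it handles just the user, pass, retr, dele, and quit commands with simplified single-line replies.

```python
class ToyPop3Session:
    """In-memory model of the POP3 authorization, transaction, and update phases."""

    def __init__(self, users, mailbox):
        self.users = users              # hypothetical {username: password} table
        self.mailbox = list(mailbox)    # messages held on the "server"
        self.phase = "authorization"
        self.pending_user = None
        self.marked = set()             # 1-based numbers of messages marked for deletion

    def _valid(self, arg: str) -> bool:
        return (arg.isdigit() and 1 <= int(arg) <= len(self.mailbox)
                and int(arg) not in self.marked)

    def command(self, line: str) -> str:
        verb, _, arg = line.strip().partition(" ")
        verb = verb.lower()
        if self.phase == "closed":
            return "-ERR"
        if verb == "quit":
            if self.phase == "transaction":
                # Update phase: marked messages are deleted only now.
                self.mailbox = [m for i, m in enumerate(self.mailbox, 1)
                                if i not in self.marked]
            self.phase = "closed"
            return "+OK"
        if self.phase == "authorization":
            if verb == "user":
                self.pending_user = arg
                return "+OK"
            if verb == "pass" and self.users.get(self.pending_user) == arg:
                self.phase = "transaction"
                return "+OK user successfully logged on"
            return "-ERR"
        # Transaction phase
        if verb == "retr" and self._valid(arg):
            return "+OK\n" + self.mailbox[int(arg) - 1]
        if verb == "dele" and self._valid(arg):
            self.marked.add(int(arg))
            return "+OK"
        return "-ERR"


demo = ToyPop3Session({"alice": "secret"}, ["first message", "second message"])
for cmd in ["user alice", "pass secret", "retr 1", "dele 1", "quit"]:
    print(cmd, "->", demo.command(cmd).splitlines()[0])
```

Note that the message marked with dele stays in the mailbox until quit triggers the update phase, which mirrors the behavior described in the notes.
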
  •  DNS - The Internet's Directory Service
    •  Services Provided By DNS
      • Translates hostnames to IP addresses.  
        • This is the main task of the Internet’s domain name system (DNS)
        • The DNS is (1) a distributed database implemented in a hierarchy of DNS servers, and 
        • (2) an application-layer protocol that allows hosts to query the distributed database. 
        • The DNS servers are often UNIX machines running the Berkeley Internet Name Domain (BIND) software [BIND 2016]. 
        • The DNS protocol runs over UDP and uses port 53.
      • Host aliasing. A host with a complicated hostname can have one or more alias names. 
        • For example, a hostname such as relay1.west-coast.enterprise.com could have, say, two aliases such as enterprise.com and www.enterprise.com. 
        • In this case, the hostname relay1.west-coast.enterprise.com is said to be a canonical hostname. Alias hostnames, when present, are typically more mnemonic than canonical hostnames. 
        • DNS can be invoked by an application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
      • Mail server aliasing.  
        • DNS can be invoked by a mail application to obtain the canonical hostname for a supplied alias hostname as well as the IP address of the host.
      • Load distribution.  
        • DNS is also used to perform load distribution among replicated servers, such as replicated Web servers. 
        • Busy sites, such as cnn.com, are replicated over multiple servers, with each server running on a different end system and each having a different IP address. 
    •  How it Works: Centralized VS Distributed DNS
      • DNS servers are distributed around the globe rather than relying on a single, centralized DNS server that holds every mapping. Problems with a centralized design include:
        • 1. A Single point of failure.
        • 2. Traffic Volume.
        • 3. Distant Centralized Database.
        • 4. Maintenance.
      •  Distributed System 
        • DNS uses a large number of servers, organized in a hierarchical fashion and distributed around the world. No single DNS server has all of the mappings for all of the hosts in the Internet. Instead, the mappings are distributed across the DNS servers. 
        • To understand how the three classes of servers (root, TLD, and authoritative) interact, suppose a DNS client wants to determine the IP address for the hostname www.amazon.com. 
          • The client first contacts one of the root servers, which returns IP addresses for TLD servers for the top-level domain com. 
          • The client then contacts one of these TLD servers, which returns the IP address of an authoritative server for amazon.com. 
          • Finally, the client contacts one of the authoritative servers for amazon.com, which returns the IP address for the hostname www.amazon.com. 
        •  Types of Servers
          • Root DNS Server
            • Root name servers provide the IP addresses of the TLD Servers.
          • Top-Level Domain Server (TLD)
            • For each top-level domain (such as com, org, net, edu, and gov, and all of the country top-level domains such as uk, fr, ca, and jp), there is a TLD server (or server cluster).
          • Authoritative DNS Servers 
            • Every organization with publicly accessible hosts (such as Web servers and mail servers) on the Internet must provide publicly accessible DNS records that map the names of those hosts to IP addresses. An organization’s authoritative DNS server houses these DNS records.
          • Local DNS Server
    •  DNS Caching
      • DNS extensively exploits DNS caching in order to improve the delay performance and to reduce the number of DNS messages ricocheting around the Internet.  
      • The idea behind DNS caching is very simple. In a query chain, when a DNS server receives a DNS reply (containing, for example, a mapping from a hostname to an IP address), it can cache the mapping in its local memory.
    •  DNS Records and Messages
      • Resource records (RRs)  
        • The DNS servers that together implement the DNS distributed database store resource records (RRs), including RRs that provide hostname-to-IP address mappings.   
      •  DNS Messages
        • Question Section
        • Answer Section
        • Authority Section
        • Additional Section
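
The root, TLD, and authoritative lookup chain and the effect of caching can be sketched with plain dictionaries. Everything in the tables below is made up for illustration (the server labels are hypothetical and the address comes from a documentation range); real DNS caches also expire entries after a time-to-live, which this sketch ignores.

```python
# Hypothetical data standing in for the three classes of DNS servers.
ROOT = {"com": "tld-com"}                                   # TLD -> TLD server
TLD = {"tld-com": {"amazon.com": "auth-amazon"}}            # domain -> authoritative server
AUTHORITATIVE = {"auth-amazon": {"www.amazon.com": "192.0.2.10"}}


class LocalResolver:
    """Toy local DNS server that walks the hierarchy and caches answers."""

    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, hostname: str) -> str:
        if hostname in self.cache:            # cached mapping: no queries needed
            return self.cache[hostname]

        labels = hostname.split(".")
        tld_server = ROOT[labels[-1]]                         # 1. ask a root server
        self.queries_sent += 1
        auth_server = TLD[tld_server][".".join(labels[-2:])]  # 2. ask the TLD server
        self.queries_sent += 1
        address = AUTHORITATIVE[auth_server][hostname]        # 3. ask the authoritative server
        self.queries_sent += 1

        self.cache[hostname] = address
        return address


resolver = LocalResolver()
print(resolver.resolve("www.amazon.com"), resolver.queries_sent)
print(resolver.resolve("www.amazon.com"), resolver.queries_sent)
```

The second lookup is answered from the cache, so the query counter does not increase, which is the delay and traffic saving that DNS caching is meant to provide.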

1.3 Learn about Peer-to-Peer applications and how they differ from client-server.

  •  Differences
    • No Always-On Server 
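      • Any end system can communicate directly with any other end system, unlike in the client-server model.
    •  Changes in IP address
      • Peers are intermittently connected, and a peer's IP address changes when it moves to a different network.
  •  File Distribution
    •  P2P distribution scales better than client-server: in client-server, the server must upload a copy to every client, so distribution time grows linearly with the number of clients, whereas in P2P each peer also uploads to other peers.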
  •  Pros
    • Privacy and owning your data without the fear of having your data stolen or sold by a third party server.
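
For the file distribution comparison, the standard textbook lower bounds on distribution time can be evaluated directly. This is a sketch with assumptions: the formulas are the usual bounds D_cs >= max(N*F/u_s, F/d_min) and D_p2p >= max(F/u_s, F/d_min, N*F/(u_s + sum of peer upload rates)), every peer is given the same upload rate, and all the numbers are hypothetical.

```python
def client_server_time(n: int, file_bits: float, server_up: float, min_down: float) -> float:
    """Lower bound on distributing a file to n clients from one server."""
    return max(n * file_bits / server_up, file_bits / min_down)


def p2p_time(n: int, file_bits: float, server_up: float, min_down: float, peer_up: float) -> float:
    """Lower bound for P2P, assuming each of the n peers uploads at peer_up bits/sec."""
    total_up = server_up + n * peer_up
    return max(file_bits / server_up, file_bits / min_down, n * file_bits / total_up)


# Hypothetical scenario: 1 Gbit file, 100 Mbps server upload and peer download,
# 10 Mbps upload per peer.
for n in (10, 100, 1000):
    cs = client_server_time(n, 1e9, 1e8, 1e8)
    p2p = p2p_time(n, 1e9, 1e8, 1e8, 1e7)
    print(f"N={n}: client-server >= {cs:.1f} s, P2P >= {p2p:.1f} s")
```

In this scenario the client-server bound grows in proportion to N, while the P2P bound levels off because every added peer also adds upload capacity.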

1.4 Write simple client-server programs (in Python).

  • We wrote simple client-server programs in Python for our team programming assignment to see how sockets, clients, and servers interconnect when sending and receiving messages.
  • We looked at the differences in code for TCP and UDP processes.
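
As an illustration of the TCP side of that comparison, here is a minimal sketch of a TCP client and server on the loopback interface. It is not the team assignment's code: the function names, the uppercase-echo behavior, and the OS-chosen port are assumptions made for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, send back the uppercased data, then close it."""
    conn, _addr = listener.accept()      # returns a NEW socket for this client
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(message: bytes) -> bytes:
    """Run a one-shot TCP server in a thread and talk to it as a client."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()

        # Connecting performs the TCP handshake before any data is sent.
        with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
            client.sendall(message)
            reply = client.recv(1024)    # enough for the short messages used here

        worker.join()
        return reply


print(tcp_round_trip(b"hello over tcp"))
```

The visible differences from UDP code are the listen/accept/connect steps and the fact that the client never attaches a destination address to individual sends: the connection already identifies the peer. A UDP program would instead use `SOCK_DGRAM` sockets with `sendto` and `recvfrom`.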

1.5 Lab 2 Examine HTTP traffic using Wireshark.

  • In this lab, we explored several aspects of the HTTP protocol: the basic GET/response interaction, HTTP message formats, retrieving large HTML files, retrieving HTML files with embedded objects, and HTTP authentication and security.

1.6 Lab 3 Use Mininet to emulate three client-server applications: Web, Chat, and DNS. 

Part 1: Capture HTTP Packets Between Mininet Hosts. You will use an emulated Mininet host as a client, accessing another Mininet emulated host running a simple web server.
  •  1. Power on the VM, and log in to the console.
  •  2. Log in from iTerm over SSH with X forwarding so that xterm works.
    • ssh -X mininet@localhost -p 2223
  •  3. Verify that you can reach Internet hosts from the Mininet VM.
  •  4. Start the Mininet emulation with a topology of three hosts connected via one switch. 
    • sudo mn --topo=single,3
  •  5. Spawn a graphical window on the emulated host h1. 
    • xterm h1
  •  6. Start a simple web server on host h2, running on port 80, with this command.
    • h2 python -m SimpleHTTPServer 80 & 
  •  7. Run Wireshark to capture the packets that go between the hosts h1 & h2 through the switch. Select either of the interfaces on either of the switches for the packet capture. (Your choices of interfaces to select are s1-eth1 and s1-eth2.) Use the following display filter: http or tcp. 
    • s1 wireshark & 
  •  8. Make an HTTP request from h1 to the web server running on h2: 
    • h1 wget h2  
    • The wget command is a command line utility for downloading files from the Internet. It is included in the Mininet VM version of Linux. By default, this command will download the files into the working directory. 
    • Take a screenshot of the Wireshark HTTP trace. Make sure you include the entire “conversation” between h1 and h2, including the TCP handshake (SYN, SYN-ACK, and ACK.) 
  • 9. Stop the Wireshark packet capture and examine the packets involved in making the request. Look at the encapsulated packets and all header field values. 
  • 10. Clean up after using.
    • sudo mn -c
    • What is the name of the file that h1 downloaded from h2 using the wget command? 
      • The requested path is just "/" (the server root); wget saves the response as index.html by default.
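
The web server and wget steps in Part 1 can be approximated on a single machine without Mininet. The sketch below assumes Python 3, where the `SimpleHTTPServer` module used in step 6 lives in `http.server`; the loopback address, the OS-chosen port, and the names `QuietHandler` and `fetch_root` are illustrative, not part of the lab.

```python
import http.client
import http.server
import socketserver
import threading


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Same behavior as SimpleHTTPRequestHandler, minus the per-request log line."""

    def log_message(self, format, *args):
        pass


def fetch_root():
    """Serve the current directory on a free loopback port, then GET '/'."""
    with socketserver.TCPServer(("127.0.0.1", 0), QuietHandler) as server:
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")      # same path that "wget h2" asks for
            response = conn.getresponse()
            body = response.read()
            conn.close()
            return response.status, response.reason, len(body)
        finally:
            server.shutdown()             # stop serve_forever()
            thread.join()


print(fetch_root())
```

Capturing on the loopback interface in Wireshark while this runs should show the same pattern as the lab trace: the TCP handshake, the GET request for "/", and the server's response.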

Part 2: DNS. You will briefly explore the Domain Name System, using your Mininet VM.
  1. Open a terminal and use the command nslookup, which allows the host to query any specified DNS server for a DNS record. The queried DNS server can be: a root DNS server, a top-level-domain DNS server, an authoritative DNS server, or an intermediate DNS server. Essentially, nslookup sends a DNS query to the specified DNS server, receives a DNS reply, and displays the result. 
  2. Use a second command-line tool called dig (Domain Information Groper) to examine the lookup sequence in more detail. 
  3. If no DNS server is specified, the query goes to the default DNS server, which on a home setup is typically a local DNS resolver running on the router. 
  4. Start
  5. Run nslookup on the server named "gaia".  mininet@mininet-vm:~$ nslookup gaia.cs.umass.edu
    1. The response should contain two pieces of information:
    2. (1) The name and IP address of the DNS server that provides the answer.
    3. (2) The answer itself, which is the host name and IP address of gaia.cs.umass.edu.


  6. Start Wireshark and set the capture to the NAT adapter that you verified in Part 1 above (eth0 or eth1):  mininet@mininet-vm:~$ sudo wireshark &
  7. Now examine the results when using the wget command. This time the wget command initiates the DNS lookup before the web request: mininet@mininet-vm:~$ wget gaia.cs.umass.edu
  8. Examine the packets involved in making the request. Look at the encapsulated packets and all header field values. You should see six captured packets: the query and response of the DNS lookup for both IPv4 and IPv6 addresses, and then http request and response, as shown below. Select the DNS response packet for IPv4 and expand the response so that you can see the flags, the query and the answer. You should get a result like the screenshot below. Take a screenshot of this Wireshark DNS trace. Make sure you include both the DNS packets and the HTTP packets. Also, make sure you include the expanded packet showing the DNS response to the query about the gaia.cs.umass.edu server. 
    1. What is the destination port for the DNS query message? What is the source port of the DNS response message?
      1. Both are port 53: the query's destination port is 53 and the response's source port is 53.
    2. In the lab, to what initial IP address is the DNS query message sent? Now use ipconfig to determine the IP address of your local DNS server.
      1. These two IP addresses are the same.
    3. Which of the following are the names and addresses of some of CSUMB's DNS servers? (There are two correct answers.) Use nslookup to check the addresses.
      (1) Name:dnssec1.cenic.org Address: 207.62.80.186 
      (2) Name: ns1.csumb.edu Address: 198.189.5.3
       
  9. Now run the basic nslookup command again. 
    1. nslookup www.csumb.edu  
    2. The output shows a canonical name associated with that hostname, followed by two addresses for the canonical name.
    3. This shows that CSUMB has transferred its web hosting to a cloud service provider (Terminalfour).
    4. The www.csumb.edu hostname is actually an alias for the hosted web servers in the cloud.  

  10. You can also use nslookup with various options.
    1. mininet@mininet-vm:~$ nslookup -type=CNAME www.csumb.edu
    2. This queries just the record for the canonical name of CSUMB's web server.
  11. You can use nslookup -type=NS, which sends a query for the NS record to the default local DNS resolver.
    1. nslookup -type=NS google.com 
  12. You can use nslookup -type=MX csumb.edu
    1. This finds the name and address of CSUMB's mail server.
    2. Mail servers are each listed with a preference number (lower number = higher priority) in case one server is down.
    3. What is the name of CSUMB's preferred mail server?
      1. aspmx.l.google.com

  13. You can run a command called dig, which performs the same query as nslookup, on the Mininet VM.
    1. mininet@mininet-vm:~$ dig gaia.cs.umass.edu

  14. The +trace option for dig lists each different server the query goes through to its final destination. 
    1. mininet@mininet-vm:~$ dig +trace gaia.cs.umass.edu
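
To connect the Wireshark view of a DNS query with its byte layout, here is a minimal sketch that builds the kind of query message nslookup or dig sends for an A record. It follows the standard DNS message format (a 12-byte header followed by the question section); the function name and the fixed query ID are illustrative assumptions, and the sketch only builds the bytes without sending them.

```python
import struct


def build_dns_query(hostname: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a standard DNS query (QTYPE 1 = A record, QCLASS 1 = IN)."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then counts for the question, answer, authority,
    # and additional sections. A query carries exactly one question.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)

    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"

    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


query = build_dns_query("gaia.cs.umass.edu")
print(len(query), query.hex())
```

Sent in a UDP datagram to port 53 of a DNS server, these bytes would appear in Wireshark as a standard query with one entry in the question section and empty answer, authority, and additional sections.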
       


