PAD/DP

Lab Page

Authors: Dan Cosma, Vlad Roșculescu-Nemeș

Updated: 19.02.2022

PAD / DP Lab recommends Linux and Java

Dear students,

Welcome to the official web page of the Distributed Programming laboratory!

Part 1: BSD Sockets

The first assignment is the implementation of a small TCP application using BSD sockets. The application must consist of a server and at least one client that uses the services provided by the server. Teams will propose specifications for the application by e-mail to the lab supervisor. Examples are linked below.

The deadline for submitting the project is Week 4 of the semester.

The following tasks must be accomplished:

An application-level communication protocol
A server and at least one client that closely follow the protocol. The programs must be written in C, using BSD sockets on Linux or UNIX
[Optional] The project can be hosted on a public code repository (git or svn), or on the CS Department's Git server
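
To make the idea of an application-level protocol more concrete, here is a minimal sketch of a length-prefixed message format. The assignment itself must be written in C; Java is used here only to keep the examples on this page in one language, and the same layout maps onto read()/write() calls on a socket. The frame layout (1-byte type, 4-byte big-endian length, UTF-8 payload), the class name Frame and the 64 KiB limit are assumptions made for illustration, not part of the requirements.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wire format (an assumption, not prescribed by the assignment):
//   1 byte  message type
//   4 bytes payload length, big-endian (network byte order)
//   N bytes UTF-8 payload
final class Frame {
    static final int MAX_PAYLOAD = 64 * 1024; // arbitrary safety limit

    final int type;
    final String payload;

    Frame(int type, String payload) {
        this.type = type;
        this.payload = payload;
    }

    // Serialize one frame: type, then length, then payload bytes.
    static byte[] encode(int type, String payload) throws IOException {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        if (body.length > MAX_PAYLOAD) {
            throw new IOException("payload too large: " + body.length);
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeByte(type);
        out.writeInt(body.length); // DataOutputStream writes big-endian
        out.write(body);
        out.flush();
        return buffer.toByteArray();
    }

    // Read exactly one frame. readFully blocks until all requested bytes
    // have arrived, which matters because TCP is a byte stream and may
    // deliver one message in several pieces.
    static Frame decode(DataInputStream in) throws IOException {
        int type = in.readUnsignedByte();
        int length = in.readInt();
        if (length < 0 || length > MAX_PAYLOAD) {
            throw new IOException("invalid length: " + length);
        }
        byte[] body = new byte[length];
        in.readFully(body);
        return new Frame(type, new String(body, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode(1, "LOGIN alice");
        Frame frame = decode(new DataInputStream(new ByteArrayInputStream(wire)));
        System.out.println(frame.type + " " + frame.payload + " (" + wire.length + " bytes)");
    }
}
```

The explicit length field lets the receiver know where one message ends and the next begins, so neither side has to guess message boundaries from how the bytes happen to arrive.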

Example Projects (in Romanian)

Part 2: Distributed Application

Requirements

The goal of this project is to build an application consisting of several distributed, interconnected software components that provide a set of functionalities according to the business requirements.

The deadline for submitting the project is Week 13 of the semester.

There are two types of components that must be implemented:

Servers, which maintain the global status of the application and provide its main functionality. Servers may serve different geographical areas and/or follow a specific purpose within the application. Servers communicate with each other in order to update the status, maintain consistency, synchronize data, and so on. Feel free to choose a technology that you already know and that can handle the business requirements of the application; some examples are listed below.

Clients. They provide the users with a lightweight and possibly mobile interface and communicate with the servers. There can be several types of clients: standard standalone applications featuring a platform-specific UI, Web applications, mobile, etc.

Clients and servers may communicate through technologies and techniques of choice, such as: HTTP, REST, SOAP, OData, WebSockets, RMI, etc.

Depending on the desired functionality, the servers/clients can also communicate with externally provided services such as Maps API, Social Media queries, object storage for images and media, etc.

The application must consist of a combination of at least 3 components (e.g. 2 servers and 1 client). The team must set up and provide access to a git repository for the source code, available from the beginning of the project implementation.

Please keep in mind that the application must be designed to be "cloud-native", i.e. purposely built to be deployed out of the box to a hyperscaler such as Azure, Google Cloud Platform or Amazon Web Services.

After choosing a project, each team will present its git repository, update its readme.md with the specification, and demo the application running on the cloud platform of choice.

Example Application

Medical Supplies Procurement Platform

Allows suppliers and buyers to view and post medical products, place orders and bids, and complete transactions.

Users will be able to connect to the platform via a web application (React) or an Android App.

They will have the option to register, view products, submit products (with pictures), place bids and retrieve other users' email addresses in order to contact them.

The gateway server is a Java application built with the Spring framework. It will provide the client applications with REST endpoints through which the registration, bidding, viewing and submission services are exposed. The payloads will be encoded in JSON format, and the data transfer objects will mirror the entities stored in the database.

To avoid overloading the server when the number of orders (bids and postings) exceeds the maximum throughput offered by the underlying hardware, the gateway server will connect to a Kafka broker and pass the incoming orders as messages on a topic. Before being pushed to Kafka, the orders are enriched with a unique identifier, to be used as a key when referencing them downstream.
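
The enrichment step can be sketched as follows. This is only an illustration: the class OrderEnvelope and its field names are hypothetical, the identifier is assumed to be a random UUID (the text does not say how it is generated), and the Kafka producer call itself is left out so that the example runs without external dependencies.

```java
import java.util.UUID;

// Hypothetical envelope around an incoming order; names are assumptions.
final class OrderEnvelope {
    final String orderId; // unique key assigned by the gateway
    final String payload; // the order as received from the client (e.g. JSON)

    private OrderEnvelope(String orderId, String payload) {
        this.orderId = orderId;
        this.payload = payload;
    }

    // Attach a random (version 4) UUID before the order leaves the gateway.
    static OrderEnvelope enrich(String payload) {
        return new OrderEnvelope(UUID.randomUUID().toString(), payload);
    }

    public static void main(String[] args) {
        OrderEnvelope order = enrich("{\"productId\":42,\"bid\":10.5}");
        // With a real producer, order.orderId would be sent as the message
        // key and order.payload as the message value.
        System.out.println(order.orderId + " -> " + order.payload);
    }
}
```

Returning the identifier to the client in the gateway's response gives the UI something to ask about later, when it checks whether the order was processed.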

The gateway Java server will handle all user management actions (create user, change password, update shipping address) but will pass the responsibility of order fulfillment to another server, connected to the same Kafka broker.

The processing server is a small microservice running on Node.js, listening to the Kafka topic the gateway server pushes to. It will process each order, call external APIs and save the order in the database.

When a medical product is submitted, the Node.js server will automatically check whether the product is FDA approved and will label its entry in the catalog appropriately. This is done via an external call to the 510(k) API provided for free by the US government (https://open.fda.gov/apis/device/510k/how-to-use-the-endpoint/). The processing server will persist all products, bids and offers in a MongoDB database.

Since the order is fulfilled asynchronously, the UI will not know, at the moment the gateway server responds to the AJAX call, whether the order was successfully placed. The UI will therefore apply a polling strategy, calling the order/status endpoint of the gateway server every 2 seconds, so that the end user is informed of the status of their order in near real time.
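
A polling loop of this kind could look roughly like the sketch below. It is written in Java for consistency with the other examples, although the real UI would run in the browser or on Android. The status values "PENDING" and "PLACED", the class StatusPoller and the attempt limit are assumptions, and the HTTP call to the order/status endpoint is replaced by a Supplier so that the example is self-contained.

```java
import java.util.function.Supplier;

final class StatusPoller {
    // Calls fetchStatus until it returns something other than "PENDING"
    // or until maxAttempts calls have been made. Returns the last status seen.
    static String pollUntilDone(Supplier<String> fetchStatus, int maxAttempts, long intervalMillis)
            throws InterruptedException {
        String status = "PENDING";
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = fetchStatus.get();
            if (!"PENDING".equals(status)) {
                return status;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(intervalMillis); // wait before asking again
            }
        }
        return status;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the HTTP call: reports PENDING twice, then PLACED.
        int[] calls = {0};
        Supplier<String> fakeEndpoint = () -> ++calls[0] < 3 ? "PENDING" : "PLACED";
        // The text uses a 2000 ms interval; 200 ms keeps the demo short.
        System.out.println(pollUntilDone(fakeEndpoint, 10, 200));
    }
}
```

Bounding the number of attempts keeps the client from polling forever if the processing server is down; what to show the user in that case is a design decision for the team.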

Technologies, Concepts and Use-Cases

Messaging queues

The most relevant use-cases for a messaging queue are:

Messaging - Useful when transmitting messages between system components (e.g. financial transactions between two servers, product orders between website back end and fulfillment center system).

Metrics - Centralize and aggregate various information about an application (e.g. number of concurrent users, CPU load per minute, memory used, specific features accessed and their latency). An example use case would be recording the number of concurrent users in a video streaming application in order to display live dashboards with the most viewed streams.

Website Activity Tracking - Register where a client clicked in a web page, identify user sessions and map their journey on the website (e.g. record the flow of a user searching for a product, filtering, browsing different items and then buying a specific item in an online store, in order to further optimize the process).

Stream Processing - Useful when designing components in the Pipes and Filters / Smart endpoints, dumb pipes architectural pattern. Components are only required to connect to topics/queues, process and filter the information on that topic, then output it further down the pipeline.

Example technologies: Apache Kafka, RabbitMQ
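
As a rough illustration of the Pipes and Filters idea from the Stream Processing use case, the sketch below uses in-memory queues as stand-ins for topics. With Apache Kafka or RabbitMQ, the queues would be replaced by consumers and producers. The class FilterStage and the "view:"/"buy:" event format are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;
import java.util.function.Function;
import java.util.function.Predicate;

// One stage of a pipeline: reads from an input queue, drops messages the
// predicate rejects, transforms the rest and passes them downstream.
final class FilterStage<I, O> {
    private final Predicate<I> keep;
    private final Function<I, O> transform;

    FilterStage(Predicate<I> keep, Function<I, O> transform) {
        this.keep = keep;
        this.transform = transform;
    }

    // Consume everything currently in the input queue.
    void drain(Queue<I> in, Queue<O> out) {
        I message;
        while ((message = in.poll()) != null) {
            if (keep.test(message)) {
                out.add(transform.apply(message));
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical activity events in the form "<action>:<productId>".
        Queue<String> events = new ArrayDeque<>(
                Arrays.asList("view:42", "buy:42", "view:7", "buy:7"));
        Queue<Integer> purchases = new ArrayDeque<>();

        // Keep only purchases and extract the product id.
        FilterStage<String, Integer> buysOnly = new FilterStage<>(
                e -> e.startsWith("buy:"),
                e -> Integer.parseInt(e.substring(4)));
        buysOnly.drain(events, purchases);

        System.out.println(purchases);
    }
}
```

The stage knows nothing about who produced the events or who will read the purchases, which is what allows stages to be added, removed or scaled independently.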

API

A server-side web API is a programmatic interface consisting of one or more publicly exposed endpoints to a defined request-response message system, typically expressed in JSON or XML, which is exposed via the web, most commonly by means of an HTTP-based web server (more info here).

Front end & Back End

In software architecture, there may be many layers between the hardware and the end user. The front end is an abstraction that simplifies the underlying components by providing a user-friendly interface, while the back end usually handles data storage and business logic.

Git

Git is the most widely used version control system. Git tracks the changes you make to files, so you have a record of what has been done and can revert to specific versions should you ever need to. Git also makes collaboration easier, allowing changes by multiple people to be merged into one source. Find out more here.

Microservices - Service Oriented Architecture

What sets a microservices architecture apart from more traditional, monolithic approaches is how it breaks an app down into its core functions. Each function is called a service, and can be built and deployed independently, meaning individual services can function (and fail) without negatively affecting the others.

This helps you to embrace the technology side of DevOps and make constant iteration and delivery (CI/CD) more seamless and achievable. (whitepaper here, great article with illustrations here)

Cloud Native

Cloud native computing uses an open source software stack to be:

More info on the principles of Cloud Native here, deploying to GCP, develop and deploy to AWS.

Databases

SQL

SQL databases have a rigid data storage model, with a predefined schema describing tables with fixed columns. They are general-purpose databases that are typically scaled vertically (upgrading the current server's resources or replacing it with a better one) and rely on joins to combine data from several tables.

Examples: Oracle, MySQL, Microsoft SQL Server, and PostgreSQL

NoSQL

NoSQL databases have a flexible data storage model that depends on the type of database. Document: JSON documents; Key-value: key-value pairs; Wide-column: tables with rows and dynamic columns; Graph: nodes and edges.

They are able to scale horizontally (adding servers based on load) and have a flexible schema, with each type suited to a different purpose. Document: general purpose; Key-value: large amounts of data with simple lookup queries; Wide-column: large amounts of data with predictable query patterns; Graph: analyzing and traversing relationships between connected data.

Examples: Document: MongoDB and CouchDB; Key-value: Redis and DynamoDB; Wide-column: Cassandra and HBase; Graph: Neo4j and Amazon Neptune

Vertical vs Horizontal Scaling

Horizontal scaling means that you scale by adding more machines to your pool of resources, whereas vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.

An easy way to remember this is to think of a machine in a server rack: we add more machines in the horizontal direction, and add more resources to a single machine in the vertical direction.
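
Once there are several machines, something has to decide which machine handles which request or piece of data. One simple approach, sketched below under the assumption of a fixed list of servers, is to hash a key and take the remainder modulo the number of servers. The class ShardRouter and the server names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Routes each key to one server out of a fixed pool using hash modulo N.
final class ShardRouter {
    private final List<String> servers;

    ShardRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    // The same key always maps to the same server. Math.floorMod keeps the
    // index non-negative even when hashCode() is negative.
    String serverFor(String key) {
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(Arrays.asList("server-0", "server-1", "server-2"));
        for (String user : Arrays.asList("alice", "bob", "carol")) {
            System.out.println(user + " -> " + router.serverFor(user));
        }
    }
}
```

A drawback of plain modulo routing is that adding a machine changes the result for most keys, so a lot of data has to move. Techniques such as consistent hashing are commonly used to limit this.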

Resources

BSD Sockets [in Romanian]
Chapter 2 in a book we wrote in the past: D. Cosma, S. Veres, A. P. Mierlutiu: Aplicatii software distribuite, Editura de Vest, 2003
Unix Sockets Tutorial [EN]
Programarea folosind BSD sockets
Designing an Application-Level Communication Protocol
excerpts from the course.