Programarea aplicatiilor distribuite
Distributed Programming

Lab Page

 Authors: Dan Cosma, Vlad Roșculescu-Nemeș, Petra Csereoka
 Updated: 15.02.2021

The PAD lab recommends Linux and Java

Dear students,

Welcome to the PAD lab web page! In this lab, you will have the opportunity to become acquainted with one of the most interesting fields in computing: distributed software systems. We will start with programming using BSD Sockets (basic primitives without which modern networked applications, and even the Internet, could not exist) and work up to discussing the most representative types of distributed technologies currently available. The approach will be pragmatic: over the course of the semester you will implement two complete mini-projects, which will help you understand an important part of the problems that arise in real distributed software applications.

We wish you the best of luck!

The PAD Team

Part 1. BSD Sockets

The first assignment is to implement a small TCP application using BSD sockets. The application must consist of a server and at least one client that uses the services provided by the server. Each team proposes a specification for its application and sends it by e-mail to the lab supervisor. Examples are given below.
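As a rough illustration of the client/server shape described above, here is a minimal sketch in Java, whose `ServerSocket` and `Socket` classes wrap the same accept/connect model as BSD sockets. The "uppercase" service, the class name `TcpUppercaseSketch`, and the method names are hypothetical and are not one of the lab's project specifications. A real submission would also need a protocol design, handling of multiple clients, and error handling.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical one-shot TCP service: the server replies with the request line in upper case.
class TcpUppercaseSketch {

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoFlush = true so that println() pushes the line onto the wire immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Server side: accept a single client on an already-bound socket, answer one line, return.
    static void serveOnce(ServerSocket server) {
        try (Socket client = server.accept();
             BufferedReader in = reader(client);
             PrintWriter out = writer(client)) {
            String line = in.readLine();
            if (line != null) {
                out.println(line.toUpperCase(Locale.ROOT));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: connect, send one line, read one line back.
    static String request(int port, String message) {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             BufferedReader in = reader(socket);
             PrintWriter out = writer(socket)) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs server and client in one process: bind to a free loopback port, serve in a background thread.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> serveOnce(server));
            serverThread.start();
            String reply = request(server.getLocalPort(), message);
            serverThread.join();
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello pad"));
    }
}
```

In an actual project the server and client would be separate programs. They share a process here only so that the sketch can be run without any setup.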

The following tasks must be accomplished:

The deadline for submitting the project is Week 4 of the semester.

Examples of projects [RO]

Part 2. Distributed application

This part focuses on building a distributed application from scratch, on Java EE or another platform of your choice. The platform and the actual specification of the assignment must be agreed with the lab coordinator.

The deadline for the submission of this project is Week 13.

The application consists of several distributed, interconnected software components that work together toward a common goal. At least two types of components must be implemented:

Clients and servers may communicate through technologies and techniques of choice, such as: HTTP, REST, messaging services, RMI, etc.

Requirements that must be met:


Examples of Technologies, Concepts and Use-Cases

Messaging queues

The most relevant use cases for a messaging queue are:

Messaging - Useful when transmitting messages between system components (e.g. financial transactions between two servers, product orders between website back end and fulfillment center system).

Metrics - Centralize and aggregate various information about an application (e.g. number of concurrent users, CPU load per minute, memory used, specific features accessed and their latency). An example use case would be recording the number of concurrent users in a video streaming application in order to display live dashboards with the most viewed streams.

Website Activity Tracking - Register where a client clicked in a web page, identify user sessions and map their journey on the website (e.g. record the flow of a user searching for a product, filtering, browsing different items, then buying a specific item in an online store, in order to further optimize the process).

Stream Processing - Useful when designing components in the Pipes and Filters / Smart endpoints, dumb pipes architectural pattern. Components are only required to connect to topics/queues, process and filter the information on that topic, then output it further down the pipeline.

Example technologies: Apache Kafka, RabbitMQ
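The Pipes and Filters idea from the Stream Processing use case can be sketched without a broker. The example below is only an in-process stand-in: `LinkedBlockingQueue` plays the role of a topic/queue, and the class name, the `order:` prefix, and the `__END__` sentinel are all hypothetical choices made for this illustration. With Kafka or RabbitMQ, the queues would be replaced by the broker's own client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-process stand-in for a pipeline of queues; not a real message broker.
class PipesAndFiltersSketch {

    // Hypothetical sentinel telling a stage that the upstream has finished.
    static final String END = "__END__";
    static final String ORDER_PREFIX = "order:";

    // Filter stage: reads from one queue, keeps only order messages, strips the prefix,
    // and writes the result to the next queue. It knows nothing about producers or consumers.
    static Thread filterStage(BlockingQueue<String> in, BlockingQueue<String> out) {
        Thread stage = new Thread(() -> {
            try {
                while (true) {
                    String msg = in.take();
                    if (msg.equals(END)) {
                        out.put(END); // propagate end-of-stream downstream
                        return;
                    }
                    if (msg.startsWith(ORDER_PREFIX)) {
                        out.put(msg.substring(ORDER_PREFIX.length()).trim());
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        stage.start();
        return stage;
    }

    // Pushes messages into the first queue and collects whatever survives the filter.
    static List<String> run(List<String> messages) {
        BlockingQueue<String> raw = new LinkedBlockingQueue<>();
        BlockingQueue<String> orders = new LinkedBlockingQueue<>();
        Thread stage = filterStage(raw, orders);
        try {
            for (String m : messages) {
                raw.put(m);
            }
            raw.put(END);
            List<String> result = new ArrayList<>();
            for (String m = orders.take(); !m.equals(END); m = orders.take()) {
                result.add(m);
            }
            stage.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("order: gloves", "metric: cpu=40", "order: masks")));
    }
}
```

The filter stage depends only on its input and output queues, which is what allows stages to be added, removed, or moved to other machines when a real broker sits between them.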

API

A server-side web API is a programmatic interface consisting of one or more publicly exposed endpoints to a defined request-response message system, typically expressed in JSON or XML, which is exposed via the web, most commonly by means of an HTTP-based web server (more info here).
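To make the definition above concrete, the following sketch exposes a single JSON endpoint using the `com.sun.net.httpserver.HttpServer` class bundled with the JDK and calls it with `java.net.http.HttpClient` (Java 11 or later). The `/products` path, the payload, and the class name are assumptions made for this illustration. A real project would more likely use a framework such as Spring and a JSON library rather than a hand-written string.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Minimal web API sketch: one endpoint, one request, one JSON response.
class WebApiSketch {

    // Starts a server on a free loopback port that answers requests to /products (hypothetical endpoint).
    static HttpServer start() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/products", exchange -> {
                byte[] body = "[{\"id\":1,\"name\":\"gloves\"}]".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: issue a GET request and return the response body as text.
    static String get(int port, String path) {
        try {
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://127.0.0.1:" + port + path))
                    .GET()
                    .build();
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Starts the server, performs one call, and shuts the server down again.
    static String demo() {
        HttpServer server = start();
        try {
            return get(server.getAddress().getPort(), "/products");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```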

Front end & Back End

In software architecture, there may be many layers between the hardware and the end user. The front end is an abstraction that simplifies the underlying components by providing a user-friendly interface, while the back end usually handles data storage and business logic.

Git

Git is the most widely used version control system. It tracks the changes you make to files, so you have a record of what has been done and can revert to a specific version should you ever need to. Git also makes collaboration easier, allowing changes made by multiple people to be merged into one source. Find out more here.

Microservices - Service Oriented Architecture

What sets a microservices architecture apart from more traditional, monolithic approaches is how it breaks an app down into its core functions. Each function is called a service, and can be built and deployed independently, meaning individual services can function (and fail) without negatively affecting the others.

This helps you embrace the technology side of DevOps and makes continuous integration and delivery (CI/CD) more seamless and achievable. (whitepaper here, great article with illustrations here)

Cloud Native

Cloud native computing uses an open source software stack to be:

More info on the principles of Cloud Native here, deploying to GCP, develop and deploy to AWS.

Databases

SQL

SQL databases have a rigid data storage model, with a predefined schema describing tables of rows with fixed columns. They are general-purpose databases that are traditionally scaled vertically (by upgrading the current server's resources or replacing it with a better one), and they require joins to combine data stored in separate tables.

Examples: Oracle, MySQL, Microsoft SQL Server, and PostgreSQL

NoSQL

NoSQL databases have a flexible data storage model that depends on the type of database. Document: JSON documents; Key-value: key-value pairs; Wide-column: tables with rows and dynamic columns; Graph: nodes and edges.

They are able to scale horizontally (by adding servers based on load) and have a flexible schema suited to different purposes, as follows. Document: general purpose; Key-value: large amounts of data with simple lookup queries; Wide-column: large amounts of data with predictable query patterns; Graph: analyzing and traversing relationships between connected data.

Examples: Document: MongoDB and CouchDB, Key-value: Redis and DynamoDB, Wide-column: Cassandra and HBase, Graph: Neo4j and Amazon Neptune

Vertical vs Horizontal Scaling

Horizontal scaling means that you scale by adding more machines to your pool of resources, whereas vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.

An easy way to remember this is to think of machines in a server rack: we add more machines in the horizontal direction, and add more resources to a single machine in the vertical direction.


Example Application

Medical Supplies Procurement Platform

The platform allows suppliers and buyers to view and post medical products, place orders and bids, and complete transactions.

Users will be able to connect to the platform via a web application (Angular) or an Android App.

They will have the option to register, view products, submit products (with pictures), place bids and retrieve other users' email addresses in order to contact them.

The gateway server is a Java application built with the help of the Spring framework. It will provide the client applications with REST endpoints through which the registration, bidding, viewing and submission services are exposed. The payloads will be encoded in JSON format, and the data transfer objects will mirror the entities stored in the database.

To avoid overloading the server when the number of orders (bids and postings) exceeds the maximum throughput offered by the underlying hardware, the gateway server will connect to a Kafka broker and pass the incoming orders as messages on a topic. Before being pushed to Kafka, the orders are enriched with a unique identifier, to be used as a key when referencing them downstream.
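The enrichment step described above can be sketched as follows. To keep the example runnable without a broker, the `TopicPublisher` interface and its `InMemoryTopic` implementation are hypothetical stand-ins for a Kafka producer. The topic name `orders`, the use of a random UUID as the identifier, and the class name are likewise assumptions, since the text does not say how the identifier is generated.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Gateway-side sketch: attach a unique key to each order, then hand it to a topic.
class OrderEnrichmentSketch {

    // Hypothetical abstraction over "send this key/value pair to a topic".
    interface TopicPublisher {
        void publish(String topic, String key, String value);
    }

    // In-memory stand-in for a broker topic, used here instead of a real Kafka producer.
    static final class InMemoryTopic implements TopicPublisher {
        final Map<String, String> messages = new LinkedHashMap<>();

        @Override
        public void publish(String topic, String key, String value) {
            messages.put(key, value);
        }
    }

    // Generates the identifier, publishes the order under that key, and returns the key
    // so that the caller can later ask about this particular order.
    static String submit(TopicPublisher publisher, String orderJson) {
        String orderId = UUID.randomUUID().toString();
        publisher.publish("orders", orderId, orderJson);
        return orderId;
    }

    public static void main(String[] args) {
        InMemoryTopic topic = new InMemoryTopic();
        String id = submit(topic, "{\"product\":\"gloves\",\"quantity\":100}");
        System.out.println(id + " -> " + topic.messages.get(id));
    }
}
```

Returning the identifier to the caller is what lets a client refer to the same order later, even though the order itself is processed elsewhere.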

The gateway Java server will handle all user management actions (create user, change password, update shipping address) but will pass the responsibility for order fulfillment to another server, connected to the same Kafka broker.

The processing server is a small microservice running on Node.js, listening to the Kafka topic that the gateway server pushes to. It will process each order, call external APIs and save the order in the database.

When a medical product is submitted, the Node.js server will automatically check whether the product is FDA approved and will label its entry in the catalog accordingly. This is done via an external call to the 510(k) API provided for free by the US government (https://open.fda.gov/apis/device/510k/how-to-use-the-endpoint/). The processing server will persist all products, bids and offers in a MongoDB database.

Since the fulfillment of the order takes place asynchronously, the UI will not know, at the moment the gateway server returns the AJAX call, whether the order was successfully placed or not. Thus, the UI will apply a polling strategy, calling the order/status endpoint of the gateway server every 2 seconds, so that end users are informed of the status of their order in near real time.
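The polling strategy described above can be sketched as a small loop. In the example application this logic would live in the Angular or Android client, so the Java version below only illustrates the control flow. The status values `PENDING` and `PLACED`, the attempt limit, and the `Supplier` standing in for the HTTP call to order/status are all assumptions made for this sketch.

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Client-side polling sketch: ask for the status repeatedly until it is final.
class StatusPollingSketch {

    // Calls fetchStatus until it returns something other than "PENDING" or the
    // attempts run out, waiting for the given interval between calls.
    static String pollUntilDone(Supplier<String> fetchStatus, Duration interval, int maxAttempts) {
        String status = "PENDING";
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = fetchStatus.get(); // stands in for the call to the order/status endpoint
            if (!status.equals("PENDING")) {
                return status;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(interval.toMillis());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return status;
                }
            }
        }
        return status; // still pending after maxAttempts
    }

    public static void main(String[] args) {
        // Fake backend: the order is pending for two polls, then reported as placed.
        Iterator<String> fakeResponses = List.of("PENDING", "PENDING", "PLACED").iterator();
        System.out.println(pollUntilDone(fakeResponses::next, Duration.ofSeconds(2), 10));
    }
}
```

Capping the number of attempts keeps the client from polling forever if the processing server never reports a final status.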

Resources