Morph: a system for real-time data processing

September 14, 2024

We created Morph to make the most of our expertise in data processing and data exploitation.

Morph is a product: a reusable infrastructure for building real-time data processing pipelines easily and cost-effectively. With it, we can build a variety of systems that tackle problems in event detection and aggregated data analysis.

For example, within the Data & Artificial Intelligence team, we used Morph to develop a solution for TMB (Transports Metropolitans de Barcelona) that detects incidents experienced by public transport users.

We have used this technology for data ingestion both in real time and in batch. The use cases are quite diverse, such as retrieving a large number of tweets on a specific topic to detect users’ sentiment, or analyzing how the prices of different cryptocurrencies evolve.

The entry barrier for Data Acceleration is low, and the approach can scale from a Proof of Concept (POC) to a cross-cutting project for any company. That is, it can grow from a very limited pilot project, not tied to the organization’s infrastructure, into a large-scale project. In short, adopting Morph is an evolutionary process in which services are added progressively and old processes are updated.

From project to product: the evolution of Morph

Morph was born as an evolution of the projects we were working on with data architectures such as Lambda or Kappa. A group of people from the Data & Artificial Intelligence team took advantage of the first weeks of lockdown during the COVID-19 crisis to dedicate some time to developing this product, which they considered highly portable and scalable.

The name Morph (chosen in a brainstorming session) alludes to transformation and metamorphosis, but also to polymorphism. It also refers to the idea of creating a useful and reusable packaged solution to help our customers maximize the value of their data.

Morph’s main advantage is that its infrastructure is deployed through Kubernetes, so it can be installed independently of any cloud provider, or even on-premise.

In addition, the components can be configured with auto-scaling policies to optimize resources, which enables us to keep computing costs to a minimum at all times.
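Morph's own auto-scaling policies are not detailed in this article. As a rough illustration of how such a policy behaves on Kubernetes, the sketch below reproduces the replica-count rule documented for the Horizontal Pod Autoscaler: the desired number of replicas is the current number multiplied by the ratio between the observed metric and its target, rounded up. The function name, the replica bounds, and the sample figures are assumptions made for this example, not Morph settings.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the Horizontal Pod Autoscaler rule.

    All parameter names and default bounds are illustrative assumptions.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)

    # Never go below the floor or above the ceiling set by the policy.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Hypothetical load: 4 replicas at 90% CPU against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
    # Hypothetical quiet period: 4 replicas at 20% CPU against a 60% target.
    print(desired_replicas(4, current_metric=20, target_metric=60))
```

When load drops, the same rule shrinks the deployment toward the configured minimum, which is how a policy of this kind keeps computing costs down during quiet periods.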

Collect, build and analyze

How does Morph work? The process consists of four phases:

  • The first step consists of collecting and extracting data from various sources, both open and proprietary.
  • Then, after cleaning, standardizing, and selecting the relevant information from the different sources, a combined data lake is built.
  • Next, we define metrics, generate aggregate data, and analyze the repository.
  • Finally, we generate tables, graphics, and reports to provide useful business-oriented visualizations.
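The four phases above can be sketched as a small chain of functions. This is a minimal, in-memory illustration under stated assumptions, not Morph's implementation: the function names, the record fields (`line`, `delay_min`), and the sample sources are all hypothetical, loosely inspired by the transport incident use case.

```python
from collections import defaultdict
from statistics import mean


def collect(sources):
    """Phase 1: pull raw records from every source into one stream."""
    for name, records in sources.items():
        for record in records:
            yield {"source": name, **record}


def clean(records):
    """Phase 2: standardize fields and drop records missing required values."""
    for r in records:
        line, delay = r.get("line"), r.get("delay_min")
        if line is None or delay is None:
            continue
        yield {
            "source": r["source"],
            "line": str(line).strip().upper(),
            "delay_min": float(delay),
        }


def aggregate(records):
    """Phase 3: compute per-line metrics over the combined data."""
    by_line = defaultdict(list)
    for r in records:
        by_line[r["line"]].append(r["delay_min"])
    return {
        line: {"events": len(delays), "avg_delay_min": mean(delays)}
        for line, delays in by_line.items()
    }


def report(metrics):
    """Phase 4: render the metrics as a plain-text table."""
    header = f"{'line':<6}{'events':>7}{'avg_delay':>11}"
    rows = [
        f"{line:<6}{m['events']:>7}{m['avg_delay_min']:>11.1f}"
        for line, m in sorted(metrics.items())
    ]
    return "\n".join([header, *rows])


# Hypothetical sample data standing in for an open and a proprietary source.
sources = {
    "open_feed": [{"line": " l1 ", "delay_min": "4"}, {"line": "L2", "delay_min": 10}],
    "internal": [{"line": "L1", "delay_min": 6}, {"line": "L3"}],
}
metrics = aggregate(clean(collect(sources)))

if __name__ == "__main__":
    print(report(metrics))
```

Because the first two phases are generators, records flow through one at a time, which mirrors how a streaming pipeline processes events as they arrive. In a real deployment each phase would typically be a separate, independently scalable component.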

Morph reference architecture

Sngular's Data & AI team relies on a diverse group of experts for these tasks. Data engineers design the data flow, carry out the required transformations, and define the storage structure. DevOps specialists configure the resources associated with the deployment on Kubernetes. Finally, data visualization specialists build, deploy, and configure dashboards with tools such as Google Data Studio, Tableau, or Power BI.

The democratization of distributed, high-performance computing enables us to set up these types of systems quickly, efficiently, and at a low cost, reducing uncertainty and making their improvement and maintenance easier.

Looking to the future, Luis Mesa, Data Engineer & Cloud Architect in Sngular's AI team, believes that it is in the access and use of data from the client's sources where more effort needs to be invested. "For example, it’s much better to obtain data through an API - which permits filtering, limiting, and grouping them - than to launch a batch process that reads CSV data stored daily on an FTP server," says Luis.

There is no doubt that real-time information enables us to make more realistic and accurate decisions. All the actions or events that we try to identify or predict with this technology are aimed at reducing our clients’ response time or improving their ability to anticipate events, in order to maximize profit.

Morph provides robustness and high availability, scalability, quick deployment, flexibility, and freedom.

Morph’s key features

  1. Morph is extremely robust because it’s deployed on top of Kubernetes. If a node goes down, components are automatically rebalanced while a replacement node is provisioned.
  2. Designed to be scalable. It uses only the resources it needs at any given time. If the processing requirements grow, Morph can automatically increase its capacity.
  3. Deployment is practically immediate. You only need to create or adapt some connectors to ingest or transfer data from various sources to be able to operate with them.
  4. No commitments to specific systems or databases. Morph works with any vendor or managed service, even in environments based on hybrid cloud solutions.
  5. No vendor lock-in: Morph can be deployed on all public cloud providers and on-premise. It only requires the creation of a Kubernetes cluster.

The possible applications are many, and we look forward to showing them to you and explaining in detail how Morph works. If you wish to know more, don’t hesitate to contact us at alfred@sngular.com.