Sample Scramjet architecture for Big Data
The diagram below shows a legacy and a new content management system, synchronized on the fly by a set of dedicated Scramjet platform instances that detect changes and save data to the new system's database.
Use Case: Synchronize two production systems "on the fly"
Enterprises that process and store big data sets upgrade and replace their systems quite often. In addition, their IT architectures are complex. Dedicated systems often need their own data subsystem, even when there is a central transactional database or analytical data warehouse. The need to synchronize two systems is therefore quite common, either during a go-live parallel run of the legacy and new systems or during day-to-day operations in a complex IT setup.
Scramjet Cloud Platform keeps systems "synced", offering a single solution to read, transform, and save data to another destination.
A provider of content management systems in SaaS mode is rolling out a new platform with extra features for selected customers. The provider wants to offer these customers access to the "beta" version in live mode, with data synchronized on the fly between the current production version and the beta version. They are looking for a simple, inexpensive, and flexible platform to synchronize this data, because the beta version is still evolving and the data transfer logic will change frequently.
Scramjet Cloud Platform can play the role of a flexible "synchronization pipeline" in this use case: it allows creating instances that transfer data with minimal effort and updating them when needed.
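The read, transform, and save steps of such a synchronization pipeline can be sketched in plain Python. This is only an illustration under stated assumptions, not Scramjet API: the record layout (`id`, `title`), the `to_beta_schema` mapping, and the in-memory dictionaries standing in for the legacy and beta databases are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


def detect_changes(legacy: Iterable[Record], seen: Dict[object, Record]) -> Iterator[Record]:
    """Yield only records that are new or differ from the last version seen."""
    for record in legacy:
        key = record["id"]
        if seen.get(key) != record:
            seen[key] = dict(record)  # remember a copy for the next comparison
            yield record


def to_beta_schema(record: Record) -> Record:
    """Hypothetical mapping from the legacy layout to the beta layout."""
    return {"id": record["id"], "title": str(record["title"]).strip(), "source": "legacy"}


def synchronize(
    legacy: Iterable[Record],
    target: Dict[object, Record],
    seen: Dict[object, Record],
    transform: Callable[[Record], Record] = to_beta_schema,
) -> int:
    """Read changed records, transform them, and upsert them into the target store."""
    written = 0
    for record in detect_changes(legacy, seen):
        converted = transform(record)
        target[converted["id"]] = converted
        written += 1
    return written


# Two polling rounds against the same (hypothetical) legacy data.
seen_versions: Dict[object, Record] = {}
beta_store: Dict[object, Record] = {}
first_round = synchronize(
    [{"id": 1, "title": " Hello "}, {"id": 2, "title": "World"}], beta_store, seen_versions
)
second_round = synchronize(
    [{"id": 1, "title": " Hello "}, {"id": 2, "title": "World!"}], beta_store, seen_versions
)
```

Because the mapping is passed in as a function, a change in the beta schema only requires swapping `transform`, which mirrors the requirement that the transfer logic can be updated frequently.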
Use Case: Simplify data collection pipelines
Big Data companies collect more and more data from multiple, heterogeneous data sources in various formats. Their data collection pipelines become more complex, difficult to maintain, and expensive.
Scramjet Cloud Platform can help build a single, unified platform for data wrangling and extraction.
A company provides market information and runs separate systems for data acquisition, wrangling (modifying formats), and enrichment (adding supplementary information to the data). Unfortunately, the TCO (Total Cost of Ownership) of the data collection and extraction system is growing faster than the company's revenues, which worries senior management.
With Scramjet Cloud Platform, the data acquisition, formatting, filtering, and enrichment systems could be much simpler and less expensive and, in addition, fully integrated with the machine learning part. Beyond the significant savings from eliminating redundant systems, it would also reduce the programmers' workload, which lowers fixed costs and shortens the time to market for new functions.
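As a rough illustration of what a unified acquisition, formatting, filtering, and enrichment flow looks like, the sketch below merges two differently formatted sources into one chain of stages. It uses only the Python standard library and does not represent Scramjet API; the field names (`symbol`, `price`), the sample inputs, and the `sectors` lookup table are assumptions made for the example.

```python
import csv
import io
import itertools
import json
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]


def from_csv(text: str) -> Iterator[Record]:
    """Acquire records from a CSV source."""
    yield from csv.DictReader(io.StringIO(text))


def from_json_lines(text: str) -> Iterator[Record]:
    """Acquire records from a JSON Lines source (one JSON object per line)."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


def normalize(records: Iterable[Record]) -> Iterator[Record]:
    """Format: bring every source to one shared layout and type set."""
    for record in records:
        yield {"symbol": str(record["symbol"]).upper(), "price": float(record["price"])}


def keep_valid(records: Iterable[Record]) -> Iterator[Record]:
    """Filter: drop records with a non-positive price."""
    return (record for record in records if record["price"] > 0)


def enrich(records: Iterable[Record], sectors: Dict[str, str]) -> Iterator[Record]:
    """Enrich: attach a sector from a (hypothetical) reference table."""
    for record in records:
        yield {**record, "sector": sectors.get(record["symbol"], "unknown")}


def run_pipeline(sources: Iterable[Iterable[Record]], sectors: Dict[str, str]) -> List[Record]:
    """Chain all stages over the merged sources; records flow through one at a time."""
    merged = itertools.chain.from_iterable(sources)
    return list(enrich(keep_valid(normalize(merged)), sectors))


csv_feed = "symbol,price\nabc,10.5\nxyz,-1\n"
json_feed = '{"symbol": "def", "price": 3}\n'
result = run_pipeline([from_csv(csv_feed), from_json_lines(json_feed)], {"ABC": "energy"})
```

Adding a new source in this design means writing one more reader function; the formatting, filtering, and enrichment stages stay shared rather than being duplicated per source.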
Use Case: Enrich and minimize data in live mode
The current data landscape is becoming more and more complex. Relying on one corporate data set from a major transactional system is no longer enough. Data must be augmented with additional sources of information and transformed into "knowledge", not just a series of string, boolean, and number fields. The legacy way to do this is to save raw data into storage and enrich it step by step, saving each stage into another storage. This approach is very slow and produces massive amounts of duplicated data in various data persistence systems, which increases data storage, maintenance, and security costs.
Scramjet Cloud Platform can perform all these operations via chained instances and output data representing real knowledge, not just petabytes of duplicated data.
"A dynamic scaleup from the blockchain industry collects a massive amount of market data from various sources. Over time they gathered petabytes of archived data that they still need for training their statistical and Machine Learning models. In addition to that, day-to-day data volumes quadrupled over last year due to adding more data sources and the growing cryptocurrencies market. Now they face the challenge of the total reshaping of their data extraction and storage strategy. They want to focus on extracting meaningful information very early in the process and avoiding storage of unnecessary data in multiple places, creating one, distilled "golden copy" of data for both customers and internal data science teams.
Scramjet Cloud Platform technology can be a technology of choice here, as it will allow for creating complex data processing pipelines and processing all data ""in memory"" avoiding costly storage of raw, technical datasets.
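The idea of enriching and distilling data in memory, without persisting each intermediate stage, can be sketched with chained Python generators. This is a minimal sketch and not Scramjet API: the tick layout `(pair, price, volume)`, the derived `notional` field, and the per-pair summary with a volume-weighted average price are hypothetical choices for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical raw record layout: (trading pair, price, volume).
Tick = Tuple[str, float, float]


def enrich_with_notional(ticks: Iterable[Tick]) -> Iterator[dict]:
    """Enrich each raw tick with a derived field, one record at a time."""
    for pair, price, volume in ticks:
        yield {"pair": pair, "price": price, "volume": volume, "notional": price * volume}


def distill(records: Iterable[dict]) -> Dict[str, dict]:
    """Fold the stream into one compact summary per pair; raw records are not kept."""
    summary: Dict[str, dict] = {}
    for record in records:
        entry = summary.setdefault(
            record["pair"], {"trades": 0, "volume": 0.0, "notional": 0.0}
        )
        entry["trades"] += 1
        entry["volume"] += record["volume"]
        entry["notional"] += record["notional"]
    for entry in summary.values():
        # Volume-weighted average price; None when no volume was traded.
        entry["vwap"] = entry["notional"] / entry["volume"] if entry["volume"] else None
    return summary


raw_ticks = [("BTC-USD", 100.0, 2.0), ("BTC-USD", 110.0, 2.0), ("ETH-USD", 10.0, 1.0)]
golden_copy = distill(enrich_with_notional(raw_ticks))
```

Only the summary dictionary survives the run. The enriched intermediate records exist just long enough to be folded into it, which is the contrast with the store-every-stage approach described above.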