The idea of a startup and the theoretical foundation for business modules, data retrieval, the data center, and big data
Basic Idea
When designing a new startup application, the most important thing is to sort out the business model clearly, analyze it, and design the system architecture around the characteristics of the business. This avoids many detours.
First, determine the core functions required by each module of your business operations. The business modules guarantee the basic service capabilities. Then extract and encapsulate the basic functions that each business module needs, splitting them into business services and public services that support the business capabilities.
Next, split the system into its major parts: business activities, the data center, and the underlying big data capabilities. No large module should be strongly coupled to another, so that each module can be expanded independently.
Then determine how the modules collaborate. Examples are the communication between the business and the data center (interface standards, data security, and other details) and the data handling mode between the data center and the underlying big data layer. The goal is to keep data circulating.
Finally, implement the specific details of each module. Where the business model allows the same components and architectural approaches to be used, unify the architecture selection and component dependencies to reduce the barriers between modules.
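The decoupling and interface standard described in these steps can be sketched in a few lines. This is only an illustration under assumptions: the names DataCenter, InMemoryDataCenter, RiskService, and query_score are hypothetical and do not come from any particular system. The business module depends on an agreed interface, so the data center behind it can be replaced or scaled independently.

```python
from typing import Protocol


class DataCenter(Protocol):
    """Hypothetical interface standard between business modules and the data center."""

    def query_score(self, customer_id: str) -> float: ...


class InMemoryDataCenter:
    """Stand-in implementation; a real one might call a remote service."""

    def __init__(self, scores: dict[str, float]) -> None:
        self._scores = scores

    def query_score(self, customer_id: str) -> float:
        # Unknown customers get a neutral score of 0.0 (an assumption).
        return self._scores.get(customer_id, 0.0)


class RiskService:
    """Business module that depends only on the interface, not on an implementation."""

    def __init__(self, data_center: DataCenter, threshold: float = 0.5) -> None:
        self._data_center = data_center
        self._threshold = threshold

    def is_risky(self, customer_id: str) -> bool:
        return self._data_center.query_score(customer_id) >= self._threshold


service = RiskService(InMemoryDataCenter({"c1": 0.9, "c2": 0.1}))
print(service.is_risky("c1"), service.is_risky("c2"))  # True False
```

Swapping InMemoryDataCenter for another class with the same method requires no change to RiskService, which is the point of agreeing on the interface first.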
Startup ideas differ greatly in industry, specific business, level of development and management, level of informatization, and technical background.

Modules for startup operations
Dividing and designing modules around business scenarios, together with building basic public services, keeps the structure of the business operations reasonable and scalable. The basic test of whether the structure is reasonable is whether each new business scenario forces drastic revisions of the system. If service capabilities keep growing while the cost of changing the system stays small, the structure is naturally reasonable.
Customer operation
Onboarding each customer requires a complete set of procedures: service descriptions, billing rules, contract management, recharge, service activation and deactivation, billing, and a series of supporting functions. There are usually two entrances: the customer login side and the operation side.
Payment and settlement
This is the most complex system module. It provides payment capabilities, such as aggregating multiple payment channels to handle customer recharges and refunds or the service provider's own payment needs. It also outputs settlement bill data and supports reconciliation so that accounts can be balanced.
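Reconciliation can be illustrated with a minimal sketch. It assumes that both the internal ledger and the payment channel's statement can be reduced to a mapping from order number to amount; the function name reconcile and the sample data are hypothetical.

```python
from decimal import Decimal


def reconcile(internal, channel):
    """Compare two {order_no: amount} mappings and report the discrepancies."""
    shared = internal.keys() & channel.keys()
    return {
        # Recorded internally but absent from the channel statement.
        "missing_in_channel": sorted(internal.keys() - channel.keys()),
        # Present in the channel statement but never recorded internally.
        "missing_internally": sorted(channel.keys() - internal.keys()),
        # Present on both sides with different amounts.
        "amount_mismatch": sorted(no for no in shared if internal[no] != channel[no]),
    }


internal_ledger = {"A001": Decimal("5.00"), "A002": Decimal("0.60"), "A003": Decimal("1.00")}
channel_statement = {"A001": Decimal("5.00"), "A002": Decimal("0.50"), "A004": Decimal("2.00")}

report = reconcile(internal_ledger, channel_statement)
print(report)
```

Accounts balance when all three lists are empty; any entry points to an order that needs manual or automated follow-up.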
Order management
Each customer request, or each use of a service, requires a detailed order record for billing that includes the unit price, order number, and time. Orders are the core basis for settlement. They are also where business data is most concentrated and where bursts in volume occur.
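A minimal sketch of such an order record, and of billing derived from it, might look as follows. The field names are assumptions, and the quantity field in particular is added for illustration; the text above only names unit price, order number, and time. Amounts use Decimal to avoid floating-point rounding in money calculations.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    order_no: str
    customer_id: str
    unit_price: Decimal
    quantity: int  # assumption: number of billable units in this order
    created_at: datetime

    @property
    def amount(self) -> Decimal:
        return self.unit_price * self.quantity


def bill_total(orders, customer_id):
    """Sum the amounts of all orders that belong to one customer."""
    return sum((o.amount for o in orders if o.customer_id == customer_id), Decimal("0"))


orders = [
    Order("A001", "c1", Decimal("0.05"), 100, datetime(2024, 1, 1, 9, 0)),
    Order("A002", "c1", Decimal("0.20"), 3, datetime(2024, 1, 1, 9, 5)),
    Order("A003", "c2", Decimal("1.00"), 1, datetime(2024, 1, 2, 8, 0)),
]
print(bill_total(orders, "c1"))  # 5.60
```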
Permission system
In a startup system, the design of the permission system focuses on the needs of the organization that operates the application. Different business teams are responsible for different service operations, customer management, and so on, so clear and systematic permission management is required for the different roles.
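One common way to organize permissions by role is a role-to-permission mapping. The sketch below assumes that approach; the role names and permission strings are hypothetical examples, not taken from a real system.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "operator": {"service:activate", "service:deactivate"},
    "account_manager": {"customer:view", "customer:edit"},
    "finance": {"bill:view", "bill:export"},
}


def is_allowed(roles, permission):
    """Return True if any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed(["operator"], "service:activate"))  # True
print(is_allowed(["operator"], "bill:export"))       # False
```

Keeping the mapping in one place means a new team or role is added as data rather than as scattered checks in business code.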
Log integration
With a detailed log system, normal business log data can be used to complete and analyze data when a service behaves abnormally, and error log data lets developers analyze system problems and bottlenecks so that service capabilities can be optimized continuously.
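Separating normal business logs from error logs can be sketched with Python's standard logging module. This is a minimal example under assumptions: the logger name and messages are made up, and in-memory streams stand in for real log files or a log collection service.

```python
import io
import logging

# In-memory streams stand in for separate log destinations.
business_stream = io.StringIO()
error_stream = io.StringIO()

logger = logging.getLogger("order_service")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep records out of the root logger

# Business handler: only records below ERROR.
business_handler = logging.StreamHandler(business_stream)
business_handler.addFilter(lambda record: record.levelno < logging.ERROR)

# Error handler: only records at ERROR and above.
error_handler = logging.StreamHandler(error_stream)
error_handler.setLevel(logging.ERROR)

logger.addHandler(business_handler)
logger.addHandler(error_handler)

logger.info("order A001 created")
logger.error("payment channel timeout for order A002")

print(business_stream.getvalue(), end="")
print(error_stream.getvalue(), end="")
```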

Different aspects of data retrieval
Data retrieval usually comes in many types of models, which leads to complex system architecture and business logic. Each business has its own capabilities and complexity, and data management itself is not easy. Therefore, the business scenarios for data retrieval should be considered in the early stage of system architecture.
API service
Data retrieval over HTTP, where data is obtained by sending a request. Examples are risk control models, scoring, anti-fraud, and similar services.
Platform services
A comprehensive system that integrates capabilities, suited to customers with little need for customized services, and that offers data retrieval across the complete process. An example is an automated digital marketing platform that provides full-process management capabilities for marketing.
Customer-related data
Generally, customers submit related events such as clicks through event tracking points, and the system performs summary analysis across all channels and provides feedback to the customers.
Visual analysis
This is divided into two major areas: data analysis and visualization. Data from multiple data sources can be loaded for joint analysis, with highly automated analysis built on front-end components, as in common data insight systems.
In these scenarios, each business needs data support for its own scenario, but the businesses share the same basic functions, such as operations, settlement, and orders. Once the different business scenarios are understood, it is straightforward to find their common points and differences. The common points are developed as public services, and the business differences are developed as independent services, which makes it easier for the system to keep expanding and evolving.

Importance of data center
Deploying the data center as an independent architecture is very necessary in a new startup, because most of the data is interlinked. The linkage processing between data does not need to be coupled to the business level: data flow, correction, and security management can all be done in the data center.
Different business modules rely on core data capabilities. These core data supporting capabilities are usually deployed separately and serve various scenarios, which is what is usually understood as a data center. At the same time, the business modules themselves also generate various data.
Service capability
As a public dependency of multiple businesses, the data center needs to provide data query capabilities, and it also needs to provide certain scheduling and computing capabilities for processing massive data tasks.
Deployment method
Depending on the characteristics of the data, it is usually stored in a variety of ways, such as clusters, sharded databases and tables, OLAP engines, and data warehouses. On top of this storage, unified service capabilities are provided and opened up to business intelligence.
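As an illustration of splitting data across databases and tables, the sketch below routes a record to a shard by hashing its key. The hash-based routing, the shard counts, and the orders_db_N and orders_N naming are all assumptions; real deployments often rely on middleware or on the database's own partitioning instead.

```python
import hashlib


def shard_for(key, db_count=4, tables_per_db=8):
    """Map a key to a (database, table) pair with a stable hash."""
    # hashlib gives the same result across processes, unlike the built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    slot = int(digest, 16) % (db_count * tables_per_db)
    db_index, table_index = divmod(slot, tables_per_db)
    return f"orders_db_{db_index}", f"orders_{table_index}"


db_name, table_name = shard_for("customer-42")
print(db_name, table_name)
```

The same key always lands on the same shard, so reads and writes for one customer stay in one place. Changing the shard counts changes the mapping, so a resharding plan is needed before they are adjusted.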
Data update
Data needs to be updated in real time or on a regular schedule. Updates usually come from data that has been calculated and processed by big data methods, from data that the business modules have verified to be incorrect, or from data that is continuously optimized during use.

Modules needed for big data
The underlying big data components are the core capability of the system. Accurate calculation and analysis of data ensure service capabilities, and continuous automation and tool-based management of the existing architecture are very important. The more manual the management of massive data is, the lower the efficiency. In particular, when the bottom layer pushes data to the data center or receives data from it, a strategy must be agreed on to ensure safe, stable, and automatic data transmission.
The large volume of data requires massive data processing capabilities, so many big data component technologies are used to store, calculate, analyze, and move data.
Data storage
The most common storage at the bottom layer of big data is file-based, alongside structured database storage, semi-structured log files, and some unstructured data.
Computing power
Processing massive data relies on parallel computing, offline tasks, real-time computing, and other methods to achieve fast processing.
Data handling
After data processing is completed, the bottom layer does not provide service capabilities directly. The data is usually synchronized to the data center above it, which provides service capabilities to the business. Handling here can mean either outputting processed data or taking in data that is to be processed.
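Synchronizing processed data upward can be sketched as a batched, keyed upsert. This is a simplified illustration under assumptions: a plain dictionary stands in for the data center's store, each row is assumed to carry a unique id field, and the function names are hypothetical.

```python
def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def sync_to_data_center(rows, store, batch_size=2):
    """Upsert processed rows into the data center store, batch by batch."""
    synced = 0
    for batch in batches(rows, batch_size):
        for row in batch:
            # Writing by key makes a retried batch overwrite instead of duplicate.
            store[row["id"]] = row
        synced += len(batch)
    return synced


processed = [
    {"id": "c1", "score": 0.9},
    {"id": "c2", "score": 0.1},
    {"id": "c3", "score": 0.4},
]
data_center_store = {}
print(sync_to_data_center(processed, data_center_store))  # 3
```

Because each write is keyed by id, running the same sync again after a failure leaves the store in the same state, which helps keep automatic transmission stable.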