Digital Nomad

Bitcoin and Ethereum are both decentralized networks built on blockchain technology. Each is powered by its own currency, BTC and ETH respectively, and, more importantly, these currencies can be used to exchange value outside their respective ecosystems.

Bitcoin was born after the 2008 financial crisis, when public confidence in banks and financial institutions was at a historic low. The purpose of this new digital currency was to serve as a viable alternative to the conventional financial system while giving people complete control over their own finances.

Bitcoin, as a global decentralized financial system, is a medium for payment transactions and has the potential to become a digital store of value. Bitcoins can be sent at any time, from anywhere, without involving any third party. …



Apache Beam is a unified programming framework open sourced by Google. It is not a streaming platform in itself. Instead, it provides a unified programming model that helps users build their own data processing pipelines, for both batch and stream processing, which can run on a range of execution engines.

What Apache Beam offers

  • A unified programming model that covers both batch and stream processing
  • Beam SDKs, with support for Java and Python
  • A series of Runners (which can be thought of as “adapters”) that let the programming model run on different underlying processing engines (Google Cloud Dataflow, Spark, Flink)

A PCollection abstractly represents a potentially distributed, multi-element data set. You can also think of a PCollection as “pipeline” data: Beam transforms take PCollection objects as input and produce them as output, so if you want to process data in a pipeline, it must take the form of a PCollection. …



Although data analysis is hidden behind the business system, it plays a very important role: its results are pivotal to decision-making and business development. With the development of big data technology, terms such as data mining and data exploration have become more and more visible. But before Hadoop-style big data analysis systems became popular, data analysis had already undergone considerable development. BI-based data analysis in particular already had very mature and stable technical solutions and ecosystems.



Let’s first outline the uses of data warehouses and data platforms in the Internet industry:

  • Integrate all business data of the company and establish a unified data center.
  • Provide various reports, some for senior management and some for each business.
  • Provide data support for website operations, that is, use data to let the operations team see in a timely manner how the website and products are performing.
  • Provide online or offline data support for various businesses and become the company’s unified data exchange and provision platform.
  • Analyze user behavior data and use data mining to reduce costs and improve returns, for example through targeted advertising and personalized recommendations for…


Arduino is a development framework, not a chip, nor a circuit board. It supports development for many types of processor chips and ships with many libraries. Both the software and the hardware are developed in a clearly modular, building-block fashion, which makes application development simple, convenient and fast.

Arduino is a platform

Arduino is just an open source development platform implemented with Java and the GNU toolchain, and its structure is derived from Processing, a software development tool made by art lovers. It can support a variety of MCUs, including Atmel’s ATtiny series and AVR8, ARM Cortex-M0 and ARM Cortex-M3 parts, ST’s ARM Cortex-M3 and M4 parts, etc. …



Using a Raspberry Pi, a microSD card and a power supply, you can build a simple desktop computer. You also need an HDMI cable and a suitable monitor, perhaps an old one. A USB keyboard and mouse are also required.

The Raspberry Pi 3 also has built-in Wi-Fi and Bluetooth. If you use a model without them, you need a compatible USB dongle.

After everything is set up and your preferred operating system (for example, the latest version of Raspbian) is installed, the desktop computer is ready to use.

Many estimates indicate that one of the main uses of Raspberry Pi is the Kodi Media Center. Some Kodi versions have been released as disk images. …



In big data processing, big data frameworks play a key role. Using them makes the integrated processing of large-scale data easy, and extracting intelligence and other useful reports becomes simple as well. From manual statistical analysis to today’s distributed computing platforms, these frameworks are the keystone behind the rapid increase in data processing speed and the continuous evolution of the overall architecture. Nowadays, there are many big data frameworks available on the market. The most popular ones are Hadoop, Spark and Storm. …



In 1993, Edgar F. Codd, the founder of relational databases, proposed the concept of online analytical processing (OLAP). Essentially, it is the concept of multidimensional databases and multidimensional analysis. The goal is to meet the specific query and reporting requirements of decision support or multidimensional environments. With the arrival of the Internet era, the surge in data volume also brought new challenges to relational databases. The most obvious challenges are as follows:

The cost of adding a data column is huge

Because a relational database defines the fields of a table in advance, trouble arises when the table already holds hundreds of millions of rows and the business scenario needs a new column. You may be surprised to find that, under the rules of the relational database, all of those hundreds of millions of rows must be operated on at once to add the new column (otherwise the database will report errors), which poses a great challenge to server performance in the production environment. …



In the field of data processing, workloads are generally divided into online transaction processing (OLTP) and online analytical processing (OLAP). Take shopping as an example: online transaction processing ensures that the same item is not sold to more than one person, while online analytical processing counts how many people have purchased the product.
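The shopping example can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the `stock` and `orders` tables, the `buy` and `buyers_of` helpers and the sample data are all hypothetical, and a single in-memory SQLite database stands in for what would normally be separate OLTP and OLAP systems.

```python
import sqlite3

# Hypothetical schema: one row of stock per product, one row per order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, remaining INTEGER NOT NULL)")
conn.execute("CREATE TABLE orders (buyer TEXT NOT NULL, product TEXT NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('lamp', 1)")  # only one lamp for sale
conn.commit()


def buy(buyer, product):
    """OLTP-style write: take one unit and record the order in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE stock SET remaining = remaining - 1 "
            "WHERE product = ? AND remaining > 0",
            (product,),
        )
        if cur.rowcount == 0:  # nothing left, so the purchase is refused
            return False
        conn.execute("INSERT INTO orders VALUES (?, ?)", (buyer, product))
        return True


def buyers_of(product):
    """OLAP-style read: count how many distinct people bought the product."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT buyer) FROM orders WHERE product = ?",
        (product,),
    ).fetchone()
    return row[0]


first = buy("alice", "lamp")    # succeeds, stock drops to 0
second = buy("bob", "lamp")     # refused, the single lamp is already sold
lamp_buyers = buyers_of("lamp")
```

The write path touches a few rows and must be correct for every single request, while the read path scans and summarizes many rows at once. That difference in access pattern is why the two kinds of workload are usually served by different systems.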

Kylin is a big data analysis engine built on the Hadoop platform. It is a tool that can return summarized data in seconds on PB-scale data sets (1 PB = 1000 TB). As an example of summarizing data, suppose I want to know the total score of each player in my game. That is data aggregation. This ability is amazing. …
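The “total score of each person” aggregation can be written out in a few lines of plain Python. This is not how Kylin works internally and it does not use any Kylin API. It only shows what the aggregation computes, on a handful of made-up rows. The `events` data and the `total_score_per_player` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical game events: (player, points scored in one round).
events = [("ann", 10), ("bob", 7), ("ann", 5), ("cho", 12), ("bob", 3)]


def total_score_per_player(rows):
    """Group rows by player and sum the points of each group."""
    totals = defaultdict(int)
    for player, points in rows:
        totals[player] += points
    return dict(totals)


totals = total_score_per_player(events)
```

In SQL terms this is `SELECT player, SUM(points) ... GROUP BY player`. On five rows it is trivial, and the hard part an engine like Kylin addresses is returning the same kind of answer in seconds when the input is at PB scale.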



Looking back on the 10 years of evolution of distributed computing systems, we can more easily recognize the relative positions of Spark and Ray. In 2004, Google proposed MapReduce as a cluster programming framework, working together with Google File System and other technologies that provided the underlying storage. Over the following 10 years or so, MapReduce became popular.

The reason for its success is that it gives programmers and data scientists a model that is easy to understand, richly expressive and highly fault tolerant, and that makes it easy to build a distributed system architecture on commodity hardware.
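To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using the classic word count. It is an illustration only: the function names and sample documents are assumptions, and a real MapReduce system distributes these phases across many machines and handles failures, which this sketch does not attempt.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


# Hypothetical input: each string stands for one input split.
documents = ["spark and ray", "spark and hive"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

The user only writes the map and reduce functions. Because each map call and each reduce call is independent of the others, the framework is free to run them on different machines and to rerun any that fail.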

Then in 2010, with the concept of the memory cloud proposed by Stanford, researchers realized that memory, which had seemed very expensive, was becoming cheap, and that many fault-tolerant operations that depended heavily on disk could actually be implemented in memory. In this context, Spark came into being, giving birth to RDDs and a series of memory-based optimization technologies, and replacing the original disk-based frameworks such as Hadoop Hive in small and medium-scale computing. But so far, Hive has not been completely replaced. In very large-scale (PB-level) computing scenarios, it relies on SSDs and its strong robustness, and it is still the first choice of many companies. …
