
 

Introduction to IPFS

cubly.ru
12 min
November 17, 2021

Let's find out what IPFS is.

IPFS (InterPlanetary File System) is a distributed system for storing and accessing files, websites, applications and data.

IPFS is a peer-to-peer storage network. Content is available through peers located anywhere in the world, each of which can relay information, store it, or do both. IPFS works much like BitTorrent clients do: it also uses distributed hash tables (DHT) and can find, within seconds, network participants who are able to share the requested files. To see why this matters, it helps to compare how we browse the familiar Internet with the approach IPFS offers.

When we use a browser, we access content through a URL (Uniform Resource Locator), which tells us where that content can be found. An example of such an address is https://mysite.com/my_perfect_book.pdf

What can we learn from this address? The file is called my_perfect_book.pdf and is located on the site mysite.com, whose IP address we can look up using DNS. We know nothing about what the file contains until we download it. Moreover, the content can change over time, so we may download something completely different from what we expected. There is also a risk that the material will be censored (perhaps even by mistake), or that the site will simply cease to exist.

In IPFS, unlike the classic web, addresses look like this: QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco. Such an address is called a Content Identifier (abbreviated CID). Unlike traditional URLs, which indicate where a file is located, IPFS addresses are derived from the content itself (more specifically, from its cryptographic hash). This gives us two benefits:

  1. We can always check that the data we downloaded has not been tampered with, simply by computing its hash. If even one byte of the file differs, the hashes will not match, and IPFS will discard the file.

  2. Since the address says not “where” but “what” data to get, censoring it becomes much more difficult. Files can reside in several storage locations at once, including the caches of IPFS users, so even if access to the external Internet is completely blocked, the data can still be obtained as long as it remains in the storage of at least one reachable peer.
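The first benefit can be illustrated with a minimal sketch. It assumes a plain SHA-256 hex digest as a simplified stand-in for a CID (a real CID carries extra metadata around the hash), and the function names `content_address` and `verify` are hypothetical:

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a simplified stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_address: str) -> bool:
    """Recompute the address of the received data and compare it with the requested one."""
    return content_address(data) == expected_address


original = b"my perfect book, first edition"
address = content_address(original)

# The same bytes with a single character changed.
tampered = b"my perfect book, first editioN"

print(verify(original, address))   # True
print(verify(tampered, address))   # False
```

Because the address is computed from the bytes themselves, anyone holding the address can run the same check without trusting whoever served the file.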

How does IPFS work?

To understand IPFS, you need to understand 3 fundamental principles:

  1. Unique Resource Identification Using Content Addressing

  2. Linking Content with Directed Acyclic Graphs (DAGs)

  3. Content Discovery with Distributed Hash Tables (DHTs)

Content addressing in IPFS 

We have already partly covered this point above: for each piece of data a Content Identifier (CID) is created, a hash that uniquely addresses this data. In fact, besides the hash, the CID also encodes a version. There are currently two: version 0 CIDs start with Qm, while version 1 CIDs are somewhat more complex and carry a prefix for the CID's encoding method as well as the content format, so that programs know how to interpret the content. There is a handy page on the official website that shows how to interpret a CID.
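To make the "starts with Qm" observation concrete, here is a rough sketch of how a version 0 CID is put together: a SHA-256 multihash (the bytes 0x12 and 0x20 followed by the 32-byte digest) encoded in base58. The helper names are hypothetical. The digest is taken over arbitrary bytes rather than a real IPFS block, so the resulting string is not expected to match what an IPFS node would produce for the same data.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(raw: bytes) -> str:
    """Encode bytes as base58 (Bitcoin alphabet)."""
    n = int.from_bytes(raw, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Decode a base58 string back to bytes."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def cid_v0_from_digest(digest: bytes) -> str:
    """Wrap a 32-byte SHA-256 digest as a multihash and base58-encode it."""
    # 0x12 = sha2-256 function code, 0x20 = digest length (32 bytes).
    return b58encode(b"\x12\x20" + digest)


def is_cid_v0(cid: str) -> bool:
    """Check that a string has the shape of a version 0 CID."""
    if len(cid) != 46 or not cid.startswith("Qm"):
        return False
    raw = b58decode(cid)
    return len(raw) == 34 and raw[:2] == b"\x12\x20"


example = cid_v0_from_digest(hashlib.sha256(b"hello").digest())
print(example[:2], len(example))
print(is_cid_v0("QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"))
```

The fixed two-byte prefix is what makes every such identifier begin with the same two characters once it is base58-encoded.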

What is Merkle DAG and how is it used in IPFS? 

Binary hash tree

A Merkle tree (or hash tree) is a complete binary tree whose leaves contain the hashes of the data blocks and whose inner nodes contain the hash computed over the combined values of their child nodes. IPFS uses Content Identifiers in place of plain hashes.

This structure allows IPFS to link complex structures, such as a whole folder of files or a git repository. Large files in IPFS are split into chunks (whose size depends on the type of content). Accordingly, when a large file is uploaded to IPFS, a hash tree is built for it as well; the tree has its own CID and points to the CID of each chunk. There is a special tool that visualizes such trees.

If we decide to change part of a file and publish the new version to IPFS, the hashes of the leaves that cover the changed data blocks will change, and so will the hashes of their parent nodes, all the way up to the root. The hashes of the leaves whose blocks did not change stay the same, however. As a result, the new and the old file have different CIDs but refer to the same CIDs for the blocks they have in common. This is how content deduplication is achieved.
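The deduplication effect can be sketched with a two-level tree: one root that links to the hash of every chunk. This is a simplification under stated assumptions: SHA-256 hex digests stand in for CIDs, the chunk size of 4 bytes is chosen only to keep the example small, and `build_root` is a hypothetical helper rather than anything IPFS exposes.

```python
from __future__ import annotations

import hashlib


def h(data: bytes) -> str:
    """SHA-256 hex digest, standing in for a CID."""
    return hashlib.sha256(data).hexdigest()


def chunk(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_root(data: bytes, size: int = 4) -> tuple[str, list[str]]:
    """Hash every chunk, then hash the list of chunk hashes to get the root."""
    leaves = [h(c) for c in chunk(data, size)]
    root = h("".join(leaves).encode())
    return root, leaves


old_root, old_leaves = build_root(b"AAAABBBBCCCCDDDD")
new_root, new_leaves = build_root(b"AAAABBBBXXXXDDDD")  # third chunk edited

shared = set(old_leaves) & set(new_leaves)
print(old_root != new_root)  # True: the root changes with any chunk
print(len(shared))           # 3: unchanged chunks keep their identifiers
```

Only the edited chunk and the root need to be stored again; the three shared chunks are already present under the same identifiers.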

Content discovery with DHT 

A hash table is an ordinary table with two fields: a key and a value. A distributed hash table is a table that is split across all the peers in a distributed network. To find content, you ask your peers about it.

Once IPFS knows which peers hold the data blocks that make up the requested content, it uses the DHT again to find out where these peers are currently located (the routing procedure).

Now we know which peers have the content we want and how to contact them. IPFS then connects to these peers and sends them a wishlist (the CIDs of the blocks we want), and the peers send those blocks back. After receiving the blocks, IPFS computes a CID from the received data and compares it with the CID we requested. This verifies that the data has not been spoofed.
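The lookup, wishlist, and verification steps can be modelled in a few lines. This is a toy model, not the real protocol: the DHT is reduced to a dictionary of provider records, peers are in-memory objects, SHA-256 hex digests stand in for CIDs, and every class and function name is hypothetical.

```python
import hashlib


def block_id(data: bytes) -> str:
    """SHA-256 hex digest, standing in for the CID of a block."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """A peer that stores blocks and answers wishlists honestly."""

    def __init__(self, name, blocks):
        self.name = name
        self.blocks = {block_id(b): b for b in blocks}

    def send(self, wishlist):
        return {cid: self.blocks[cid] for cid in wishlist if cid in self.blocks}


class LyingPeer(Peer):
    """A peer that answers every request with forged data."""

    def send(self, wishlist):
        return {cid: b"forged data" for cid in wishlist}


def fetch(wishlist, provider_records, peers):
    """Ask each listed provider in turn and keep only blocks whose hash matches."""
    received = {}
    for cid in wishlist:
        for peer_name in provider_records.get(cid, []):
            data = peers[peer_name].send([cid]).get(cid)
            if data is not None and block_id(data) == cid:
                received[cid] = data
                break  # verified copy found, stop asking other peers
    return received


blocks = [b"chunk one", b"chunk two"]
wanted = [block_id(b) for b in blocks]

peers = {"honest": Peer("honest", blocks), "liar": LyingPeer("liar", [])}
# Stand-in for DHT provider records: which peers claim to have which block.
provider_records = {cid: ["liar", "honest"] for cid in wanted}

result = fetch(wanted, provider_records, peers)
print(sorted(result.values()))
```

The forged replies are dropped because their hash does not match the requested identifier, so the requester ends up with the honest peer's copies.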

Where is the data stored? 

The data published in IPFS is stored on nodes, the members of the network. If you have ever created your own torrents, you know that at first only you have the files: you publish them and seed them to other participants. If you turn off your computer before anyone has received a copy, no one will be able to download your data, so you need to wait until at least a few people have downloaded it and started seeding before you stop seeding yourself.

List of pinned files on local node

In IPFS, everything works in a similar way. Once you publish content to your local node, it is “pinned” (pinning): IPFS will keep it on your device indefinitely until you unpin it (unpin). When someone requests content you have, they automatically save a copy of it in their local cache. Keep in mind, however, that this cache is not permanent, so over time that peer will stop serving the data if no one has accessed it for a long time. At the same time, nothing prevents another peer from also pinning the files downloaded from you, in which case it will keep distributing them alongside you.

IPFS users pin the information that they consider valuable, so information that nobody has pinned (and that is therefore of no value to anyone) is deleted over time to save space.

Keeping published information on your personal device and serving it around the clock is not always convenient. Pinning services solve this problem: in effect, they are cloud storage to which you can upload your data so that it is always available in IPFS.
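The difference between pinned and merely cached content can be captured in a small toy model. It is only a sketch: the class `ToyNode` and its methods are hypothetical, and the question of when a real node decides to run garbage collection is not modelled at all.

```python
class ToyNode:
    """A toy node that separates pinned content from temporary cache."""

    def __init__(self):
        self.blocks = {}   # identifier -> data held on this node
        self.pins = set()  # identifiers protected from garbage collection

    def publish(self, cid, data):
        """Publishing content stores it and pins it."""
        self.blocks[cid] = data
        self.pins.add(cid)

    def cache(self, cid, data):
        """Content fetched from other peers is stored but not pinned."""
        self.blocks[cid] = data

    def pin(self, cid):
        if cid not in self.blocks:
            raise KeyError(f"cannot pin unknown content: {cid}")
        self.pins.add(cid)

    def unpin(self, cid):
        self.pins.discard(cid)

    def collect_garbage(self):
        """Drop everything that is not pinned and report what was removed."""
        removed = [cid for cid in self.blocks if cid not in self.pins]
        for cid in removed:
            del self.blocks[cid]
        return removed


node = ToyNode()
node.publish("my-site", b"<html>...</html>")
node.cache("someone-elses-file", b"cached bytes")
node.cache("valuable-file", b"worth keeping")
node.pin("valuable-file")

print(node.collect_garbage())
print(sorted(node.blocks))
```

After collection only the published and explicitly pinned items remain, which mirrors why unpinned content eventually disappears from a peer.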

World Wide Web Integration 

To integrate with the existing web, two technologies are used together: IPFS Gateway and DNSLink.

IPFS Gateway

The IPFS Gateway is a server that lets you access files on the IPFS network from browsers that do not yet support IPFS directly. You can run a gateway locally or use a public one, for example https://ipfs.io. Note that when you use a gateway, the CIDs of downloaded files are not verified on the client device, so you must either trust the gateway or not use it at all.
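A gateway address is simply the gateway host followed by the content path. The sketch below builds such path-style URLs; the function name `gateway_url` is hypothetical, and the example path `index.html` is an assumption used only for illustration.

```python
def gateway_url(gateway: str, namespace: str, identifier: str, path: str = "") -> str:
    """Build a path-style gateway URL such as https://ipfs.io/ipfs/<CID>/<path>."""
    if namespace not in ("ipfs", "ipns"):
        raise ValueError(f"unknown namespace: {namespace}")
    url = f"{gateway.rstrip('/')}/{namespace}/{identifier}"
    if path:
        url += "/" + path.lstrip("/")
    return url


cid = "QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDP1mXWo6uco"
print(gateway_url("https://ipfs.io", "ipfs", cid))
print(gateway_url("https://ipfs.io/", "ipfs", cid, "index.html"))
```

Swapping the first argument for the address of a locally running gateway gives the equivalent URL served from your own node.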

DNSLink 

DNSLink uses DNS TXT records to map a domain to a CID. For example, we can find out that the ipfs.io domain points to the CID Qmf6DcRku4QUBjkToCSj3dMkhipUgg1NURYSMUFNsj1jsF:

$ dig +noall +answer TXT _dnslink.ipfs.io
_dnslink.ipfs.io. 30 IN TXT "dnslink=/ipns/website.ipfs.io"

$ dig +noall +answer TXT _dnslink.website.ipfs.io
_dnslink.website.ipfs.io. 30 IN TXT "dnslink=/ipfs/Qmf6DcRku4QUBjkToCSj3dMkhipUgg1NURYSMUFNsj1jsF"

As you can see from the example above, the TXT record _dnslink.ipfs.io points to the address /ipns/website.ipfs.io, and website.ipfs.io, in turn, points to a specific IPFS CID.
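The two-step lookup above can be traced in code. The sketch below does not query DNS; it replays the two TXT values from the dig output from a dictionary. The function names and the hop limit are hypothetical, and the code assumes that every /ipns/ target is itself a domain name with its own DNSLink record.

```python
from __future__ import annotations

# TXT values copied from the dig output above, keyed by record name.
TXT_RECORDS = {
    "_dnslink.ipfs.io": "dnslink=/ipns/website.ipfs.io",
    "_dnslink.website.ipfs.io": "dnslink=/ipfs/Qmf6DcRku4QUBjkToCSj3dMkhipUgg1NURYSMUFNsj1jsF",
}


def parse_dnslink(txt: str) -> tuple[str, str]:
    """Split 'dnslink=/<namespace>/<target>' into (namespace, target)."""
    prefix = "dnslink=/"
    if not txt.startswith(prefix):
        raise ValueError(f"not a DNSLink value: {txt!r}")
    namespace, _, target = txt[len(prefix):].partition("/")
    return namespace, target


def resolve(domain: str, records: dict[str, str], max_hops: int = 8) -> str:
    """Follow DNSLink records until one of them names an /ipfs/ path."""
    for _ in range(max_hops):
        namespace, target = parse_dnslink(records["_dnslink." + domain])
        if namespace == "ipfs":
            return "/ipfs/" + target
        # Assumption: an /ipns/ target is another domain with its own record.
        domain = target
    raise RuntimeError("too many DNSLink hops")


print(resolve("ipfs.io", TXT_RECORDS))
```

The hop limit guards against records that point at each other in a loop.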

Here it is worth mentioning IPNS (InterPlanetary Name System). IPNS allows you to create long addresses of the form k5..., which refer to a specific CID and can be updated by you. Thus, when you upload a new version of a file, you do not need to send a new CID to everyone: users get the current version by accessing the data via IPNS. DNSLink is an alternative to IPNS that uses domain names instead of long hashes. DNSLink is faster and its names are easier to remember, but be aware that DNS servers can be blocked or spoofed if you are not using DoH or DoT.

How to use 

You can install the ipfs-cli console utility or the desktop client, which includes an IPFS node and a file manager, as well as a browser extension that opens IPFS links in your browser through a local node.

In this article, as an example, we will create a one-page resume site, publish it to IPFS using one of the pinning services, and set up DNS so that when our site is opened from a normal browser, users can see it.

Creating a One Page Resume Website 

We need to prepare a static version of the site: a simple set of pages. We will not dwell on this step; on this page you can find step-by-step instructions for creating such a site with the Hugo generator (and here you can find instructions in Russian).



# Install Hugo
snap install hugo

# Create the website skeleton
hugo new site my-website
cd my-website

# Install a theme
# List of available themes: https://themes.gohugo.io/
git init
git submodule add https://github.com/gurusabarish/hugo-profile.git themes/hugo-profile

# Enable the theme
rm config.toml
cp ./themes/hugo-profile/website/v3.yaml ./config.yaml
sed -i 's/\.\/\.\./hugo-profile/g' ./config.yaml

# Edit the generated configuration to suit yourself
vim ./config.yaml

# Start the server to preview the resulting site
hugo serve

Open http://localhost:1313 to see the resulting site:

One-page website based on the typical hugo-profile template by Guru Sabarish

Now we can run the hugo command, and our site will be built into the public folder. We will publish the contents of this folder to IPFS.

Publishing files to IPFS 

So that we do not have to keep our computer on all the time, we will use one of the pinning services. I chose Pinata: it provides 1 GB of free storage and doesn't require a credit card.

After registering and uploading the public folder to the site, we get a CID by which we can view our content:

The Content Identifier of the uploaded folder is highlighted in yellow

By clicking on the eye icon to the left, we can view our site in the browser (as you can see, I chose a different template for my site):

The site uploaded to IPFS and opened via an IPFS Gateway

We can also open the IPFS Desktop application and view the contents of the site folder by entering the CID in the field at the top of the interface. The folder may not load immediately; wait until IPFS finds peers that have it.

The contents of our site folder uploaded to the IPFS network

Let's note down our CID; it will come in handy later when we set up our domain: QmZkSWaUY1dTeCNokmBdV9rS2SgRBGvTzbA2Xk49EPc6FE.

Configuring DNS, Part 1 

The last step is to associate our domain name with a Content Identifier (remember we talked about DNSLink ?). We will assume that you already have your own domain name and access to configure its DNS records.

I have at my disposal the domain name cubly.ru, whose DNS records are managed through Cloudflare. For the site to be reachable by our domain name for IPFS users, we need to create a TXT record with the name _dnslink.<domain_name> and the value dnslink=/ipfs/<your_CID>.
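As a small illustration, the record name and value can be assembled from the domain and the CID like this. The function name `dnslink_record` is hypothetical; the domain and CID are the ones used in this article.

```python
from __future__ import annotations


def dnslink_record(domain: str, cid: str) -> tuple[str, str]:
    """Return the (name, value) pair of a DNSLink TXT record for a domain."""
    return f"_dnslink.{domain}", f"dnslink=/ipfs/{cid}"


name, value = dnslink_record("cubly.ru", "QmZkSWaUY1dTeCNokmBdV9rS2SgRBGvTzbA2Xk49EPc6FE")
print(name)   # _dnslink.cubly.ru
print(value)  # dnslink=/ipfs/QmZkSWaUY1dTeCNokmBdV9rS2SgRBGvTzbA2Xk49EPc6FE
```

These two strings are what goes into the name and content fields of the TXT record in the DNS provider's control panel.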

TXT record pointing to the CID of my folder

After that, in addition to the address /ipfs/QmZk.., our site became available to IPFS users at a more concise and memorable address, /ipns/cubly.ru (view through IPFS Gateway).

If we install the IPFS Companion plugin and try to open the cubly.ru site in a browser, the plugin will automatically detect that the site is available in IPFS (by finding the TXT record we created) and redirect to our local Gateway: http://cubly.ru.ipns.localhost:8080. Our local IPFS Gateway will verify that all files downloaded by the browser are genuine by computing their hashes.

TXT record pointing to the CID of my folder

As you can see in the screenshot, the plugin has an option “Import to Files on My Node”. Clicking it imports the content of the site to our IPFS node, so that our computer also distributes it and the site opens faster for users in our city while IPFS Desktop is running. If we close IPFS Desktop, the site will still be available, because it is stored on Pinata and on other nodes that managed to cache it; our neighbors will just have to wait a little until their IPFS node finds other peers.

DNS setup, part 2 

We've set up a DNSLink record and the site is accessible with a friendly name for IPFS users, but what about users who aren't using Web 3.0 yet?

It is almost trivially simple: we will set up a CNAME record that sends all ordinary users to a public IPFS Gateway. Since I'm using Cloudflare, it makes sense to use their own IPFS Gateway: cloudflare-ipfs.com. You can read about it on the Cloudflare blog.

We just need to add a CNAME record to our domain and check that the site is opened from any device:

Adding a CNAME record to the cubly.ru domain

Keep in mind that, according to the standard, a CNAME record cannot be set on the root domain, so Cloudflare automatically substitutes the IP address of cloudflare-ipfs.com for our record. If your DNS provider cannot do this automatically, you can create an A record pointing to an IP address of cloudflare-ipfs.com:

$ nslookup cloudflare-ipfs.com

Non-authoritative answer:
Name: cloudflare-ipfs.com
Address: 104.17.64.14
Name: cloudflare-ipfs.com
Address: 104.17.96.13
Name: cloudflare-ipfs.com
Address: 2606:4700::6811:400e
Name: cloudflare-ipfs.com
Address: 2606:4700::6811:600d

Create an A-record pointing to the Cloudflare IPFS Gateway IP

The site should now be accessible to all users, including those who do not use IPFS. As you can see from the screenshot below, IPFS Companion is disabled, but the site continues to open.

A site hosted in IPFS is available to all users

The result 

As a result, we got acquainted with IPFS and how it works, and also published our site on the public network, providing access to it for IPFS and Web users. While browsing a site, IPFS users will save its resources in their cache, increasing its availability, its loading speed and making it more distributed.

IPFS has many uses, here are a few:

Join the rise of the Distributed Web ✌️
