Introduction

offchain::ipfs is Substrate, infused with IPFS.

Substrate is a blockchain framework built in Rust, with off-chain worker capabilities.
IPFS is a distributed file storage network, connecting peers and their content.

By including the Rust implementation of IPFS in the native Substrate runtime, and by allowing pass-through Wasm calls via Substrate's off-chain workers, we enable a powerful and familiar subset of the IPFS APIs, including:

  • ipfs add - Write data to IPFS
  • ipfs cat - Read data from IPFS
  • ipfs dht findpeer - Discover peers
  • ipfs dht findprovs - Discover content
  • ipfs swarm connect / disconnect - Swarm with other IPFS peers
  • ipfs pin add / rm - Pin and unpin content

offchain::ipfs allows you to account for your data transactions and DHT status in the blockchain. These on-chain insights can serve as a foundation for incentivized data storage and replication. This means no separate executable: both blockchain and distributed storage are together in one.

The offchain::ipfs Manual documents our efforts and provides explanations and code examples to get you started with this technology. Because offchain::ipfs is a well-maintained fork of paritytech/substrate, this manual also stands in for the usual project documentation, such as docs.rs pages and README.md files.

This manual is presented by: @koivunej, @ljedrz, @whalelephant, and @aphelionz

Disclaimers

You should still consider this an alpha preview.

The primary value of this work is the embedded IPFS node itself. The pallet included in the node-template binary is only meant as a showcase, and is just one of many possible realizations of offchain::ipfs.

Getting Started

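You can get up and running in about 15 minutes by using the Docker image and then previewing the functionality in a UI, as described in the sections below.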

Why a Docker image?
offchain::ipfs is currently a well-maintained fork of paritytech/substrate. Until we are ready to make an upstream PR, we are using a Docker image for simplicity's sake.

Using the Docker image

The recommended way to use offchain::ipfs is via the eqlabs/offchain-ipfs image.

Installing the image

# Pull the image from Docker Hub
$ docker pull eqlabs/offchain-ipfs

The image comes with two binaries:

  1. The default node-template binary contains our custom pallet, which lets you preview the offchain::ipfs functionality through transactions.
  2. The substrate binary does not include our custom pallet for interacting with the IPFS node; instead, you can connect to the node through its multiaddr.

The image exposes ports 9944 for WebSockets, 9933 for RPC, 30333 for p2p, and 9615 for Prometheus.

Running the image

The default command for the image is:

node-template --ws-external --rpc-external --base-path=/offchain-ipfs --dev

Run the default like so:

docker run -p 9944:9944 \
  -p 9933:9933 \
  -p 30333:30333 \
  -p 9615:9615 \
  -it \
  --rm \
  --name node-template \
  eqlabs/offchain-ipfs

To override the default and run substrate, for example:

docker run \
  -p 9944:9944 \
  -p 9933:9933 \
  -p 30333:30333 \
  -p 9615:9615 \
  -it \
  --rm \
  --name sub-ipfs \
  eqlabs/offchain-ipfs \
  substrate

This will work with any arguments you'd normally pass to substrate.

Persistent Storage

To persist storage between containers, first create a volume:

docker volume create offchain-ipfs-vol

Then add -v offchain-ipfs-vol:/offchain-ipfs to the docker run commands above.

Previewing the functionality in a UI

If you’re looking for a quick demo of the functionality, the simplest thing to do after running the Docker container's default command is to launch the substrate-front-end-template UI.

Instructions

  1. If you have node.js and yarn installed on your machine, run the following commands:

    git clone https://github.com/substrate-developer-hub/substrate-front-end-template
    cd substrate-front-end-template
    yarn install
    yarn start
    
  2. Once the UI opens in your browser, scroll down to the Pallet Interactor section at the bottom.

  3. Keep the default "Extrinsic" active, then select templateModule from the first dropdown.

  4. Then, select the callable you want from the list of callables that become available.

  5. An additional text field or fields will appear below the last select box. Type the arguments in and then click Signed.

  6. Watch your node logs and also the extrinsic events to the right for output and information.

Now what

This demo is based on our included templateModule pallet, which is mostly meant as a showcase of the embedded Rust IPFS node. In the next section we will walk you through this pallet, which will be instructive as a reference implementation.

Building an offchain::ipfs app

In this section we’ll cover the essentials on what you'll need to start building your application with offchain::ipfs. We do so by way of:

  1. Walking you through the example Substrate pallet that's included with offchain::ipfs
  2. Providing example code from two Rust clients (substrate-subxt and substrate-api-client), and one JavaScript client (polkadot.js).

We expect feedback on this pallet, but also we hope that the reference implementation will inspire builders to create their own pallets, expose their own JSON-RPC endpoints, and call them from their applications similarly.

How it all works

It helps, first, to have a basic understanding of how a request flows from a user of your application, through the Substrate offchain-worker, to the native runtime, over to IPFS, and then all the way back up to your application again.

  1. Once the chain is initialized or blocks are synced, the embedded Rust IPFS node is launched and connected to the offchain worker runtime. It will stay running in the background.
  2. The user makes a JSON-RPC call to submit an extrinsic to the node's runtime, using the callable functions exposed from the custom pallet.
  3. The request is added to the relevant queue in the Substrate storage database. This is also defined in the custom pallet.
  4. Upon import of specified blocks, the node's runtime passes the requests from the queues to an offchain worker.
  5. The offchain worker relays the desired requests to the Rust IPFS node, and the node returns futures resolving to results.
  6. The offchain worker registers the results and relays them to the Substrate runtime, which processes them and acts upon them as specified in the custom pallet.
  7. The offchain worker stops running.
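
To make the flow above more concrete, here is a minimal sketch in plain Rust with no Substrate dependencies. It is not the pallet's code: `Command`, `Outcome`, `submit_extrinsic`, `run_offchain_worker`, and `fake_node` are hypothetical names, and the fake node only fabricates placeholder results. It illustrates just the ordering of steps 2 through 6: extrinsics enqueue commands, and the worker later hands each queued command to the node and collects the results.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the pallet's command enums.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    AddBytes(Vec<u8>),
    CatBytes(Vec<u8>),
}

/// Hypothetical stand-in for a result coming back from the IPFS node.
#[derive(Debug, Clone, PartialEq)]
pub enum Outcome {
    Added(Vec<u8>),
    Cat(Vec<u8>),
}

/// Steps 2-3: an extrinsic does no I/O, it only appends a command to a queue.
pub fn submit_extrinsic(queue: &mut VecDeque<Command>, cmd: Command) {
    queue.push_back(cmd);
}

/// Steps 4-6: on block import, take every queued command, hand it to the
/// node, and collect the results in the order the commands were queued.
pub fn run_offchain_worker<F>(queue: &mut VecDeque<Command>, mut node: F) -> Vec<Outcome>
where
    F: FnMut(Command) -> Outcome,
{
    queue.drain(..).map(|cmd| node(cmd)).collect()
}

/// A fake node for illustration only: it invents a placeholder "CID" for
/// added data and echoes the CID back for cat requests.
pub fn fake_node(cmd: Command) -> Outcome {
    match cmd {
        Command::AddBytes(data) => {
            Outcome::Added(format!("fake-cid-{}", data.len()).into_bytes())
        }
        Command::CatBytes(cid) => Outcome::Cat(cid),
    }
}
```

In this sketch the worker empties the queue itself, which is a simplification made to keep the example short.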

Example Pallet

offchain::ipfs comes with a showcase pallet, which is essentially a Rust module that complies with the requirements to be included within a substrate runtime.

This pallet is meant only as an example. We're including it to be helpful for future pallet authors that want to use the embedded native IPFS node to suit their needs.

If you're familiar with Substrate and the Framework for Runtime Aggregation of Modularized Entities (FRAME), you can simply view the source code for this pallet. Otherwise, read on as we go through the code step by step.

Please note the order in which these concepts are explained here is not necessarily the order that they appear in the code.

You can also learn more by following the Building a Custom Pallet tutorial.

Prelude

We start by using items from the native runtime. Our pallet is no_std since we're targeting Wasm.

#![cfg_attr(not(feature = "std"), no_std)]
// ...
use sp_core::offchain::{
  Duration, IpfsRequest, IpfsResponse, OpaqueMultiaddr, Timestamp
};
// ...
use sp_runtime::offchain::ipfs;

Command Types

When your JSON-RPC calls are received by the pallet, the requests are expressed as variants of the Command enums below (ConnectionCommand, DataCommand, and DhtCommand) and stored in the off-chain worker storage as a queue, to be ingested by your native runtime and passed to IPFS.

Derive attributes are omitted.

// Commands involved in peer-to-peer connections
enum ConnectionCommand {
    ConnectTo(OpaqueMultiaddr),
    DisconnectFrom(OpaqueMultiaddr),
}

// Commands that add, remove, pin, unpin, and output data
enum DataCommand {
    AddBytes(Vec<u8>),
    CatBytes(Vec<u8>),
    InsertPin(Vec<u8>),
    RemoveBlock(Vec<u8>),
    RemovePin(Vec<u8>),
}

// Commands that query the distributed hash table (DHT)
// for peers and content
enum DhtCommand {
    FindPeer(Vec<u8>),
    GetProviders(Vec<u8>),
}

The runtime configuration trait

The system::Trait trait (not to be confused with the Rust trait keyword) allows you to define which capabilities from the runtime you want to include, and how you want to use them. You can also "tightly couple" your pallet to other pallets by adding their Traits to your pallet's inherited trait list.

Here, however, we keep things simple by:

  1. Loosely coupling this pallet by leaving out inherited traits
  2. Including only the required Event type

/// The pallet's configuration trait.
pub trait Trait: system::Trait { // Use traits here to tightly couple to runtime
    /// The overarching event type.
    type Event: From<Event<Self>> + Into<<Self as system::Trait>::Event>;
}

Later in the code, we implement some helper functions on the Module struct. The function bodies are omitted for brevity's sake.

impl<T: Trait> Module<T> {
    // "Sends" a request to the local IPFS node by adding it to the offchain storage
    fn ipfs_request(req: IpfsRequest, deadline: impl Into<Option<Timestamp>>)
      -> Result<IpfsResponse, Error<T>>

    // Reads from the `ConnectionQueue` and connects / disconnects
    // from desired / undesired peers, respectively
    fn connection_housekeeping() -> Result<(), Error<T>>

    // Reads `FindPeer` and `GetProviders` commands from the `DhtQueue`,
    // and requests their execution from the native runtime
    fn handle_dht_requests() -> Result<(), Error<T>>

    // Reads `AddBytes`, `CatBytes`, `RemoveBlock`, `InsertPin`, and
    // `RemovePin` commands from the `DataQueue` and requests their
    // execution from the native runtime.
    fn handle_data_requests() -> Result<(), Error<T>>

    // Logs metadata (the number of connected peers) to the console at the DEBUG log level
    fn print_metadata() -> Result<(), Error<T>>
}

The decl_ macros

Pallets included in Substrate runtimes must adhere to the conventions of FRAME. In practice, this means you must implement decl_ macros:

  • decl_module!
  • decl_event!
  • decl_storage!
  • decl_error!

decl_storage!

Here, we define the data that will actually be stored on-chain when calling extrinsics.

Since the offchain-worker can't perform I/O outside of the wasm context, we store our requests as queues, to be processed on a periodic basis, consumed, and ultimately performed by the native runtime.

This is where we use the ConnectionCommand, DataCommand, and DhtCommand command types from above, as the elements of the ConnectionQueue, DataQueue, and DhtQueue storage items.

// This pallet's storage items.
decl_storage! {
    trait Store for Module<T: Trait> as TemplateModule {
        // A list of addresses to connect to and disconnect from.
        pub ConnectionQueue: Vec<ConnectionCommand>;
        // A queue of data to publish or obtain on IPFS.
        pub DataQueue: Vec<DataCommand>;
        // A list of requests to the DHT.
        pub DhtQueue: Vec<DhtCommand>;
    }
}

decl_event!

Once a command is queued for the off-chain worker, one of the following chain events is emitted.

This is where we define what those events are and what they contain.

// The pallet's events
decl_event!(
    pub enum Event<T> where AccountId = <T as system::Trait>::AccountId {
        ConnectionRequested(AccountId),
        DisconnectRequested(AccountId),
        QueuedDataToAdd(AccountId),
        QueuedDataToCat(AccountId),
        QueuedDataToPin(AccountId),
        QueuedDataToRemove(AccountId),
        QueuedDataToUnpin(AccountId),
        FindPeerIssued(AccountId),
        FindProvidersIssued(AccountId),
    }
);

decl_module!

This section, perhaps the most critical section of any given pallet, is where you can define functions that are exposed via JSON-RPC to client libraries and, by proxy, your users.

In practice, the bulk of what these functions do is to modify the DataQueue, DhtQueue, and ConnectionQueue storage objects by pushing signed command requests to their respective queues.

Some default types in the functions are omitted, but we've kept the #[weight] attributes around.

The Substrate docs define one unit of weight as "one picosecond of execution time on fixed reference hardware." These are essentially time limits for block creation, and can be (indirectly) mapped to transaction fees analogous to something like "gas fees." Read more about weights if you're curious.
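
As a worked example of the arithmetic, the sketch below converts a weight into microseconds and counts how many calls of a given weight fit into a budget. The function names and the budget figure used in the usage note are our own assumptions for illustration, not values taken from Substrate or from this pallet.

```rust
/// Number of picoseconds in one microsecond.
pub const PICOS_PER_MICRO: u64 = 1_000_000;

/// Converts a weight (picoseconds on reference hardware) to microseconds.
pub fn weight_to_micros(weight: u64) -> f64 {
    weight as f64 / PICOS_PER_MICRO as f64
}

/// Returns how many calls of `call_weight` fit into `budget`, where both
/// values are weights. Returns `None` if `call_weight` is zero.
pub fn calls_within_budget(budget: u64, call_weight: u64) -> Option<u64> {
    budget.checked_div(call_weight)
}
```

For instance, the 200,000 weight on ipfs_add_bytes corresponds to 0.2 microseconds, and a hypothetical budget of 2,000,000 would fit four ipfs_disconnect calls at 500,000 each.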

// The pallet's dispatchable functions.
decl_module! {
    /// The module declaration.
    pub struct Module<T: Trait> for enum Call where origin: T::Origin {
        // Called at the beginning of every block before any extrinsics. Clears
        // `ConnectionQueue` and `DhtQueue` values every block, and clears
        // `DataQueue` every other block, since they should have been processed.
        // Returns a weight of 0.
        fn on_initialize(block_number: T::BlockNumber) -> Weight

        // Called at the beginning of every block to create extrinsics.
        // - `connection_housekeeping` and `handle_dht_requests` are called every block
        // - `handle_data_requests` is called on every other block
        // - `print_metadata` is called every 5 blocks
        // The less frequent calls alleviate some bandwidth and storage congestion.
        fn offchain_worker(block_number: T::BlockNumber)

        /// Mark a `Multiaddr` as a desired connection target.
        #[weight = 100_000]
        pub fn ipfs_connect(origin, addr: Vec<u8>)

        /// Queues a `Multiaddr` to be disconnected
        #[weight = 500_000]
        pub fn ipfs_disconnect(origin, addr: Vec<u8>)

        /// Add arbitrary bytes to the IPFS repository.
        #[weight = 200_000]
        pub fn ipfs_add_bytes(origin, data: Vec<u8>)

        /// Find and output IPFS data pointed to by the given `Cid`
        #[weight = 100_000]
        pub fn ipfs_cat_bytes(origin, cid: Vec<u8>)

        /// Remove bytes from IPFS by `Cid`
        #[weight = 300_000]
        pub fn ipfs_remove_block(origin, cid: Vec<u8>)

        /// Pins a given `Cid` non-recursively.
        #[weight = 100_000]
        pub fn ipfs_insert_pin(origin, cid: Vec<u8>)

        /// Unpins a given `Cid` non-recursively.
        #[weight = 100_000]
        pub fn ipfs_remove_pin(origin, cid: Vec<u8>)

        /// Find addresses associated with the given `PeerId`.
        #[weight = 100_000]
        pub fn ipfs_dht_find_peer(origin, peer_id: Vec<u8>)

        /// Find the list of `PeerId`s known to be hosting the given `Cid`.
        #[weight = 100_000]
        pub fn ipfs_dht_find_providers(origin, cid: Vec<u8>)
    }
}
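
The block-based schedule described in the offchain_worker comments can be sketched as a small standalone function. This is only an illustration: `Task` and `tasks_for_block` are hypothetical names, block numbers are simplified to `u64`, and the choice of odd blocks for data requests is our assumption, since the text only says "every other block".

```rust
/// Hypothetical labels for the helper functions the worker may call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    ConnectionHousekeeping,
    DhtRequests,
    DataRequests,
    PrintMetadata,
}

/// Decides which helpers run for a given block number.
pub fn tasks_for_block(block_number: u64) -> Vec<Task> {
    // Connection and DHT handling run on every block.
    let mut tasks = vec![Task::ConnectionHousekeeping, Task::DhtRequests];

    // Data requests run on every other block (assumed here: odd blocks).
    if block_number % 2 == 1 {
        tasks.push(Task::DataRequests);
    }

    // Metadata is printed on every fifth block.
    if block_number % 5 == 0 {
        tasks.push(Task::PrintMetadata);
    }

    tasks
}
```

Spacing out the heavier work this way is what keeps the data queue from being processed and cleared on every block.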

decl_error!

This is where we can define the myriad ways things can go wrong, as an enum.

// The pallet's errors
decl_error! {
    pub enum Error for Module<T: Trait> {
        CantCreateRequest,
        RequestTimeout,
        RequestFailed,
    }
}

Read on to see examples of how you can make calls to this example pallet from your application.

Example pallet callable reference

The following callable functions are exposed via JSON-RPC by the example pallet. These specific functions were chosen due to popularity, to give you a familiar experience.

These functions are not the only ones available in the native runtime, and therefore do not represent the full extent of functionality available to you.

Regardless, we still detail the example template functions here as a helpful reference. If you have feedback about these existing functions, or would like to request new functions, please open an issue at rs-ipfs/substrate.

Callables

What follows is a list of callables, their frequency in terms of block creation, their weights, their arguments and return values. As explained in the section on the example pallet, weights are essentially picosecond representations of a time limit for a given transaction, and can be loosely correlated to transaction fees.

Rust's snake_case is used here. However, as you'll see in the upcoming polkadot.js example, JavaScript will use camelCase for function and variable names. Generally, the exposed JSON-RPC functions will adhere to the conventions of the programming language that the client code is written in.
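
To illustrate the naming difference, here is a small sketch that maps a snake_case callable name to the camelCase form a JavaScript client would use. `snake_to_camel` is a hypothetical helper written for this manual. It is not how polkadot.js performs the conversion internally.

```rust
/// Converts a snake_case identifier to camelCase, e.g. for comparing the
/// Rust callable names in this reference with their JavaScript spelling.
pub fn snake_to_camel(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut upper_next = false;

    for ch in name.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize the character after it.
            upper_next = true;
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }

    out
}
```

With this mapping, ipfs_add_bytes becomes ipfsAddBytes and ipfs_dht_find_peer becomes ipfsDhtFindPeer.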

Also, the weights and block frequencies here are chosen rather arbitrarily, without tokenomics in mind. You will probably need to tune these values in your own custom pallet.

Finally, while we list "return" values for simplicity's sake, the template pallet does not actually return any values from the RPC calls. What this really means in the context of Substrate is that the values will eventually be emitted in the runtime logs, which you'll need to monitor.

ipfs_add_bytes

Adds the given bytes to the IPFS repository.

Frequency: Every other block
Weight: 200,000

Arguments

  • data: The bytes that you want to add to IPFS, e.g. vec![1, 2, 3, 4] or b"1234"

Returns

A Content ID (CID) string, e.g. QmU1f6ngsoHvwtViihzLQPXCA8j3sagmvY9GJJDY7Ao7Aa

ipfs_cat_bytes

Displays the bytes behind a given CID.

Frequency: Every other block
Weight: 100,000

Arguments

  • cid: The CID of your desired content, e.g. QmY7Yh4UquoXHLPFo2XbhXkhBvFoPwmQUSa92pxnxjQuPU

Returns

The requested bytes: valid UTF-8 is shown as a string, and non-UTF-8 data as a hexadecimal string
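
The output rule above can be sketched as follows. `render_bytes` is a hypothetical helper, and the lowercase, unprefixed hexadecimal format is our assumption. The pallet's actual log output may be formatted differently.

```rust
/// Renders bytes as text when they are valid UTF-8, and as a lowercase
/// hexadecimal string otherwise.
pub fn render_bytes(bytes: &[u8]) -> String {
    match std::str::from_utf8(bytes) {
        Ok(text) => text.to_string(),
        // Two hex digits per byte, no prefix or separators.
        Err(_) => bytes.iter().map(|b| format!("{:02x}", b)).collect(),
    }
}
```

Note that low byte values such as vec![1, 2, 3, 4] are valid UTF-8 control characters, so under this rule they take the string branch rather than the hexadecimal one.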

ipfs_connect

Connects the embedded node to the given Multiaddr.

Frequency: Every block
Weight: 100,000

Arguments

  • multiaddr: A valid multiaddress with the peer ID at the end, e.g. /ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ

Returns

Nothing, or an error.
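
Since the multiaddress is expected to end with the peer ID, a client may want to check its shape before submitting the extrinsic. The sketch below is a string-level check only, under the assumption that the peer ID follows a trailing /p2p/ component as in the example above. `split_peer_id` is a hypothetical helper. It does not validate the transport part or the peer ID encoding, which a real multiaddr parser would do.

```rust
/// Splits a multiaddress string into its transport part and trailing peer ID.
/// Returns `None` if there is no non-empty `/p2p/<peer id>` suffix.
pub fn split_peer_id(multiaddr: &str) -> Option<(&str, &str)> {
    const MARKER: &str = "/p2p/";

    let idx = multiaddr.rfind(MARKER)?;
    let transport = &multiaddr[..idx];
    let peer_id = &multiaddr[idx + MARKER.len()..];

    // The peer ID must be the final component and both parts must be present.
    if transport.is_empty() || peer_id.is_empty() || peer_id.contains('/') {
        return None;
    }

    Some((transport, peer_id))
}
```

A client could run this check first and only then convert the string with `into_bytes()` for the `addr` argument.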

ipfs_disconnect

Disconnects from the given Multiaddr.

Frequency: Every block
Weight: 500,000

Arguments

  • multiaddr: A valid multiaddress with the peer ID at the end, e.g. /ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ

Returns

Nothing, or an error.

ipfs_dht_find_peer

Performs a search for the addresses associated with the provided PeerId.

Frequency: Every block
Weight: 100,000

Arguments

  • peerID: A PeerId hash, e.g. QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ

Returns

A multiaddr, such as /ip4/104.131.131.82/tcp/4001.

ipfs_dht_find_providers

Search for PeerIds known to be providing the given Cid. You must be connected to at least one peer.

Frequency: Every block
Weight: 100,000

Arguments

  • cid: The CID of your desired content, e.g. QmY7Yh4UquoXHLPFo2XbhXkhBvFoPwmQUSa92pxnxjQuPU

Returns

  • peerIDs: An array of PeerId strings, e.g. [QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ]

ipfs_insert_pin

Non-recursively pins a block with the specified Cid, protecting it from removal.

Frequency: Every other block
Weight: 100,000

Arguments

  • cid: The CID of the data you want to pin, e.g. QmU1f6ngsoHvwtViihzLQPXCA8j3sagmvY9GJJDY7Ao7Aa

Returns

Nothing, or an error.

ipfs_remove_pin

Removes a pin from a block, so that it is no longer persistent and can be removed.

Frequency: Every other block
Weight: 100,000

Arguments

  • cid: The CID of the data you want to unpin, e.g. QmU1f6ngsoHvwtViihzLQPXCA8j3sagmvY9GJJDY7Ao7Aa

Returns

Nothing, or an error.

ipfs_remove_block

Removes a block from the node’s repository.

Frequency: Every other block
Weight: 300,000

Arguments

  • cid: The CID of the block you want to remove, e.g. QmU1f6ngsoHvwtViihzLQPXCA8j3sagmvY9GJJDY7Ao7Aa

Returns

Nothing, or an error.

Using offchain::ipfs from your Rust code

With substrate-subxt

In your Cargo.toml file

[dependencies]
codec = { package = "parity-scale-codec", version = "1.3.5", default-features = false, features = ["derive"] }
substrate-subxt = "0.13.0"
sp-keyring = { version = "2.0.0", default-features = false }
async-std = { version = "1.6.4", features = ["attributes"] }

Then in your main.rs:

use codec::Encode;
use sp_keyring::AccountKeyring;
use substrate_subxt::{Call, ClientBuilder, EventsDecoder, NodeTemplateRuntime, PairSigner};

#[async_std::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Signer for the extrinsic
    let signer = PairSigner::<NodeTemplateRuntime, _>::new(AccountKeyring::Alice.pair());
    // API client, default to connect to 127.0.0.1:9944
    let client = ClientBuilder::<NodeTemplateRuntime>::new().build().await?;

    // Example CID for the example bytes added vec![1, 2, 3, 4]
    let cid = String::from("QmRgctVSR8hvGRDLv7c5H7BCji7m1VXRfEE1vW78CFagD7").into_bytes();
    // Example multiaddr to connect IPFS with
    let multiaddr = String::from(
        "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    )
    .into_bytes();
    // Example Peer Id
    let peer_id = String::from("QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ").into_bytes();

    // Begin to submit extrinsics
    // ipfs_add_bytes
    let add_bytes = client
        .watch(
            AddBytesCall {
                data: vec![1, 2, 3, 4],
            },
            &signer,
        )
        .await?;
    println!("\nResult for ipfs_add_bytes: {:?}", add_bytes);

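    // ipfs_cat_bytes
    let cat_bytes = client
        .watch(CatBytesCall { cid: cid.clone() }, &signer)
        .await?;
    println!("\nResult for ipfs_cat_bytes: {:?}", cat_bytes);

    Ok(())
}

// Call payloads: the fields must match the pallet's dispatchable arguments
#[derive(Encode)]
pub struct AddBytesCall {
    data: Vec<u8>,
}

impl Call<NodeTemplateRuntime> for AddBytesCall {
    const MODULE: &'static str = "TemplateModule";
    const FUNCTION: &'static str = "ipfs_add_bytes";
    fn events_decoder(_decoder: &mut EventsDecoder<NodeTemplateRuntime>) {}
}

#[derive(Encode)]
pub struct CatBytesCall {
    cid: Vec<u8>,
}

impl Call<NodeTemplateRuntime> for CatBytesCall {
    const MODULE: &'static str = "TemplateModule";
    const FUNCTION: &'static str = "ipfs_cat_bytes";
    fn events_decoder(_decoder: &mut EventsDecoder<NodeTemplateRuntime>) {}
}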

With substrate-api-client

In your Cargo.toml file:

[dependencies]
substrate-api-client = { git = "https://github.com/scs/substrate-api-client.git" }
sp-core = { version = "2.0.0", features = ["full_crypto"] }
sp-keyring = { version = "2.0.0", default-features = false } 

Then in your main.rs:

use sp_core::crypto::Pair;
use sp_keyring::AccountKeyring;
use std::{convert::TryFrom, string::String};
use substrate_api_client::{
    compose_call, compose_extrinsic_offline, extrinsic::xt_primitives::UncheckedExtrinsicV4,
    node_metadata::Metadata, Api, XtStatus,
};

fn main() {
    // instantiate an Api that connects to the given address
    let url = "127.0.0.1:9944";
    // if no signer is set in the whole program, we need to give to Api a specific type instead of an associated type
    // as during compilation the type needs to be defined.
    let signer = AccountKeyring::Bob.pair();

    // sets up api client and retrieves the node metadata
    let api = Api::new(format!("ws://{}", url)).set_signer(signer.clone());
    // gets the current nonce of Bob so we can increment it manually later
    let mut nonce = api.get_nonce().unwrap();

    // data from the node required in extrinsic
    let meta = Metadata::try_from(api.get_metadata()).unwrap();
    let genesis_hash = api.genesis_hash;
    let spec_version = api.runtime_version.spec_version;
    let transaction_version = api.runtime_version.transaction_version;

    // Example bytes to add
    let bytes_to_add: Vec<u8> = vec![1, 2, 3, 4];
    // Example CID for the example bytes added vec![1, 2, 3, 4]
    let cid = String::from("QmRgctVSR8hvGRDLv7c5H7BCji7m1VXRfEE1vW78CFagD7").into_bytes();

    // Example multiaddr to connect IPFS with
    let multiaddr = String::from(
        "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    )
    .into_bytes();

    // Example Peer Id
    let peer_id = String::from("QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ").into_bytes();

    // Create input for all calls
    let calls = vec![
        ("ipfs_add_bytes", bytes_to_add),
        ("ipfs_cat_bytes", cid.clone()),
        ("ipfs_connect", multiaddr.clone()),
        ("ipfs_insert_pin", cid.clone()),
    ];

    // Compose, sign, and send one extrinsic per call
    for call in calls {
        println!("\n Creating Extrinsic for {}", call.0);
        let _call = compose_call!(meta, "TemplateModule", call.0, call.1);
        let xt: UncheckedExtrinsicV4<_> = compose_extrinsic_offline!(
            signer,
            _call,
            nonce,
            Era::Immortal,
            genesis_hash,
            genesis_hash,
            spec_version,
            transaction_version
        );

        let blockh = api
            .send_extrinsic(xt.hex_encode(), XtStatus::Finalized)
            .unwrap();
        println!("Transaction got finalized in block {:?}", blockh);
        nonce += 1;
    }
}

For full demo with all pallet functions, please visit here

In JavaScript

This should work in both node.js and the browser via a bundler like Webpack or Parcel.

With polkadot.js

yarn init
yarn add @polkadot/api

// Import
const { ApiPromise, WsProvider, Keyring } = require('@polkadot/api');

;(async () => {
  const provider = new WsProvider('ws://localhost:9944');
  const api = await ApiPromise.create({
    provider,
    types: {
      ConnectionCommand: 'ConnectionCommand',
      DataCommand: 'DataCommand',
      DhtCommand: 'DhtCommand',
      Address: 'AccountId',
      LookupSource: 'AccountId'
    }
  });

  await api.isReady;

  const keyring = new Keyring({ type: 'sr25519' });
  const alice = keyring.addFromUri('//Alice');
  const module = api.tx.templateModule

  const PEER_ID = 'QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ'
  const PEER_MULTIADDR = `/ip4/104.131.131.82/tcp/4001/p2p/${PEER_ID}`
  const CID_1234 = 'QmU1f6ngsoHvwtViihzLQPXCA8j3sagmvY9GJJDY7Ao7Aa'

  // Connect to a peer
  // await module.ipfsConnect(PEER_MULTIADDR).signAndSend(alice, logEvents)

  // Add data as bytes (string or raw) to IPFS
  // await module.ipfsAddBytes('1234').signAndSend(alice, logEvents)

  // Cat (retrieve) data from IPFS
  // await module.ipfsCatBytes(CID_1234).signAndSend(alice, logEvents)

  // Disconnect from a peer
  // await module.ipfsDisconnect(PEER_MULTIADDR).signAndSend(alice, logEvents)

  // Locate a peer via the distributed hash table
  // await module.ipfsDhtFindPeer(PEER_ID).signAndSend(alice, logEvents)

  // Locate a peer that has the content you're seeking via the DHT
  // await module.ipfsDhtFindProviders(CID_1234).signAndSend(alice, logEvents)
})()

const logEvents = ({ status, events }) => {
  if (status.isInBlock) {
    console.log(`included in ${status.asInBlock}`);
  }

  if (status.isInBlock || status.isFinalized) {
    events.forEach((record) => {
      const { event, phase } = record;
      const types = event.typeDef;

      console.log(`\t${event.section}:${event.method}:: (phase=${phase.toString()})`);
      console.log(`\t\t${event.meta.documentation.toString()}`);

      event.data.forEach((data, index) => {
        console.log(`\t\t\t${types[index].type}: ${data.toString()}`);
      });
    })
  }
}

Debugging JSON-RPC in the browser

You will connect to the blockchain node via JSON-RPC over WebSockets. On page load, the application will connect to port 9944.

One good way to monitor the streaming results in the browser is to use devtools.

The architecture of offchain::ipfs

As we explained in the introduction, offchain::ipfs is currently a fork of paritytech/substrate maintained by Equilibrium.

There are three branches of note:

  • master - which will always follow paritytech/substrate in lock-step, with no modifications
  • offchain_ipfs - which contains the modifications, and periodically rebases from master
  • offchain_ipfs_docker - rebases from offchain_ipfs and contains the Dockerfile updates

There may be other branches at any given time for pragmatic purposes, but the three above should always exist and be suitable for their respective roles.

In the rest of this chapter, we'll show you how to build the code in these branches, and then we'll take a closer look at what modifications were made to substrate to achieve offchain::ipfs.

Building from Source

You can build the docker image from source, or build the binaries locally. Both of these can take quite a long time, even on "developer" hardware - particularly the docker image.

Running the node from source

git clone https://github.com/rs-ipfs/substrate
cd substrate
git checkout offchain_ipfs
cargo build --workspace

Building the docker image from source

This is a multistage build based on Alpine Linux. The resulting image will contain the substrate-ipfs binary and the node-template binary.

We suggest you supply your own tag name.

git clone https://github.com/rs-ipfs/substrate
cd substrate
git checkout offchain_ipfs_docker
docker build --file .maintain/Dockerfile --tag [your-tag-here] .

Refer to the Using the Docker image section for information about running the image.

Read on to learn about the modifications in the offchain_ipfs branch, and what they do.

Substrate core modifications

Every one of offchain::ipfs's functional modifications to the paritytech/substrate core is encapsulated in a single commit on the offchain_ipfs branch of our repo.

In this section we'll walk through these modifications so that you may understand them, and perhaps improve upon them yourself.

How Substrate is organized

Substrate, as a modular framework, provides:

  1. Clients, which are services that interact with the Substrate blockchain, e.g. the Substrate full and light clients. Offchain workers are one of those clients
  2. Basic primitives to compose a blockchain with your desired features
  3. Several binaries, both necessary and optional, that you can run or build from source

There's definitely a lot more to Substrate, but for the purposes of this explanation we'll only cover the parts that offchain::ipfs augments. At a very high level, we modeled this implementation after the existing offchain::http module. You can look that over to get a sense of how it all works, or read on for more detail.

From here on, most of the links will be to code points within the offchain_ipfs branch of the offchain::ipfs repo.

offchain::ipfs lifecycle

  1. A user runs one of the binaries, node or node-template, which launches a Substrate runtime.

  2. If the offchain worker is enabled in the configuration, a secondary runtime will start in both the full client and light client to power the IPFS node.

  3. The client will use IpfsApi and IpfsWorker to expose its functionality to any pallets that wish to access it, via ipfs_request_start and ipfs_request_wait. These calls will utilize the types explained below.

  4. If your node is configured to process user input via a pallet, then users will make requests, typically in the form of extrinsics called via JSON-RPC. A typical successful call goes something like:

    1. The JSON-RPC server in the Substrate node will receive the call, which will then be dispatched to the relevant pallet.
    2. The pallet may store the calls in a queue to be handled later, or immediately create a valid IpfsRequest with the call argument(s).
    3. An offchain worker starts on each block import to relay the request to the IPFS node through its exposed APIs.
    4. When the IPFS node responds to the request, the response is registered at the APIs as an IpfsResponse. The offchain worker stops.
    5. The response can be used to update chain state through a signed or unsigned transaction, or be used in the rest of the call's logic.
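
The start-then-wait pattern behind ipfs_request_start and ipfs_request_wait can be illustrated with a mock. The sketch below is not the Substrate API: `MockIpfsApi`, its method signatures, the `u16` request ID, and the reduced `MockRequest` and `MockResponse` enums are all assumptions made for illustration, and the responses are fabricated placeholders.

```rust
use std::collections::HashMap;

/// Reduced, hypothetical versions of the request and response enums.
#[derive(Debug, Clone, PartialEq)]
pub enum MockRequest {
    AddBytes(Vec<u8>),
    Peers,
}

#[derive(Debug, Clone, PartialEq)]
pub enum MockResponse {
    AddBytes(Vec<u8>),
    Peers(Vec<String>),
}

/// A mock API that hands out request IDs and resolves them on demand.
#[derive(Default)]
pub struct MockIpfsApi {
    next_id: u16,
    pending: HashMap<u16, MockRequest>,
}

impl MockIpfsApi {
    /// Registers a request and returns an ID the caller can wait on later.
    pub fn request_start(&mut self, request: MockRequest) -> u16 {
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id, request);
        id
    }

    /// Resolves a pending request. Returns `None` for unknown or already
    /// resolved IDs. The responses are placeholders, not real IPFS results.
    pub fn request_wait(&mut self, id: u16) -> Option<MockResponse> {
        let request = self.pending.remove(&id)?;
        Some(match request {
            MockRequest::AddBytes(data) => {
                MockResponse::AddBytes(format!("fake-cid-{}", data.len()).into_bytes())
            }
            MockRequest::Peers => MockResponse::Peers(Vec::new()),
        })
    }
}
```

Separating the two calls lets a worker start several requests before waiting on any of them.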

Primitive Types

offchain::ipfs adds the following types, which are used in the Substrate runtime and the offchain workers. It is useful to explore some core types, as they indicate which offchain::ipfs functions exist and where to make changes when adding new functions that interact with IPFS.

Development Tip:

In fact, you can let rustc do a lot of work for you! By simply adding, changing, or removing variants in the following enums, you will get helpful errors telling you where else you need to change your code to satisfy the compiler.
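
Here is a minimal sketch of why this works, using a reduced, hypothetical `MiniRequest` enum rather than the real IpfsRequest. Because the `match` has no wildcard arm, it must cover every variant.

```rust
/// A reduced, hypothetical request enum for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum MiniRequest {
    AddBytes(Vec<u8>),
    CatBytes(Vec<u8>),
    Peers,
}

/// Describes a request. There is deliberately no `_ =>` arm: if a variant
/// is added to `MiniRequest`, this function stops compiling with
/// error E0004 (non-exhaustive patterns) until the new variant is handled.
pub fn describe(request: &MiniRequest) -> &'static str {
    match request {
        MiniRequest::AddBytes(_) => "add bytes",
        MiniRequest::CatBytes(_) => "cat bytes",
        MiniRequest::Peers => "list peers",
    }
}
```

Adding a variant such as `Identity` to the enum would make the compiler point at every such `match`, which gives you a checklist of the places to update.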

IpfsRequest

An enum that represents a request to the IPFS node.

  • Addrs - Get the list of the node's PeerIds and addresses.
  • AddBytes(Vec<u8>) - Add the given bytes to the IPFS repo.
  • AddListeningAddr(OpaqueMultiaddr) - Add an address to listen on.
  • BitswapStats - Get the bitswap stats of the node.
  • CatBytes(Vec<u8>) - Get bytes with the given Cid from the IPFS repo and display them.
  • Connect(OpaqueMultiaddr) - Connect to an external IPFS node with the specified Multiaddr.
  • Disconnect(OpaqueMultiaddr) - Disconnect from an external IPFS node with the specified Multiaddr.
  • GetBlock(Vec<u8>) - Obtain an IPFS block.
  • FindPeer(Vec<u8>) - Find the addresses related to the given PeerId.
  • GetClosestPeers(Vec<u8>) - Get a list of PeerIds closest to the given PeerId.
  • GetProviders(Vec<u8>) - Find the providers for the given Cid.
  • Identity - Get the node's public key and dedicated external addresses.
  • InsertPin(Vec<u8>, bool) - Pins a given Cid recursively or directly (non-recursively)
  • LocalAddrs - Get the list of node's local addresses.
  • LocalRefs - Get the list of Cids of blocks known to a node.
  • Peers - Obtain the list of node's peers.
  • Publish - Publish a given message to a topic.
    • topic: Vec<u8> - The topic to publish the message to.
    • message: Vec<u8> - The message to publish.
  • RemoveBlock(Vec<u8>) - Remove a block from the ipfs repo. A pinned block cannot be removed.
  • RemoveListeningAddr(OpaqueMultiaddr) - Remove an address that is listened on.
  • RemovePin(Vec<u8>, bool) - Unpins a given Cid recursively or only directly.
  • Subscribe(Vec<u8>) - Subscribe to a given topic.
  • SubscriptionList - Obtain the list of currently subscribed topics.
  • Unsubscribe(Vec<u8>) - Unsubscribe from a given topic.
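
Written out as Rust, the list above corresponds roughly to the enum below. Treat it as a sketch reconstructed from this manual rather than a copy of the source: the derives are our choice, OpaqueMultiaddr is assumed here to be a thin wrapper around raw bytes, and pin_and_announce is a hypothetical helper that also assumes the bool in InsertPin means "recursive".

```rust
/// Stand-in for `OpaqueMultiaddr`; assumed to wrap the raw bytes of a Multiaddr.
#[derive(Debug, Clone, PartialEq)]
pub struct OpaqueMultiaddr(pub Vec<u8>);

/// A request to the IPFS node, with variants as listed in this manual.
#[derive(Debug, Clone, PartialEq)]
pub enum IpfsRequest {
    Addrs,
    AddBytes(Vec<u8>),
    AddListeningAddr(OpaqueMultiaddr),
    BitswapStats,
    CatBytes(Vec<u8>),
    Connect(OpaqueMultiaddr),
    Disconnect(OpaqueMultiaddr),
    GetBlock(Vec<u8>),
    FindPeer(Vec<u8>),
    GetClosestPeers(Vec<u8>),
    GetProviders(Vec<u8>),
    Identity,
    InsertPin(Vec<u8>, bool),
    LocalAddrs,
    LocalRefs,
    Peers,
    Publish { topic: Vec<u8>, message: Vec<u8> },
    RemoveBlock(Vec<u8>),
    RemoveListeningAddr(OpaqueMultiaddr),
    RemovePin(Vec<u8>, bool),
    Subscribe(Vec<u8>),
    SubscriptionList,
    Unsubscribe(Vec<u8>),
}

/// Hypothetical helper: build the requests that pin `cid` and announce it on `topic`.
pub fn pin_and_announce(cid: &[u8], topic: &[u8]) -> Vec<IpfsRequest> {
    vec![
        // `true` is assumed to select recursive pinning.
        IpfsRequest::InsertPin(cid.to_vec(), true),
        IpfsRequest::Publish {
            topic: topic.to_vec(),
            message: cid.to_vec(),
        },
    ]
}
```

Note that Cids, PeerIds and topics all travel as plain Vec<u8>, so it is up to the caller to pass correctly encoded bytes.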

IpfsResponse

An enum that represents a response from the IPFS node.

  • Addrs(Vec<(Vec<u8>, Vec<OpaqueMultiaddr>)>) - A list of pairs of node's peers and their known addresses.
  • AddBytes(Vec<u8>) - The Cid of the added bytes.
  • BitswapStats - A collection of node stats related to the bitswap protocol.
    • blocks_sent: u64 - The number of blocks sent.
    • data_sent: u64 - The number of bytes sent.
    • blocks_received: u64 - The number of blocks received.
    • data_received: u64 - The number of bytes received.
    • dup_blks_received: u64 - The number of duplicate blocks received.
    • dup_data_received: u64 - The number of duplicate bytes received.
    • peers: Vec<Vec<u8>> - The list of peers.
    • wantlist: Vec<(Vec<u8>, i32)> - The list of wanted CIDs and their bitswap priorities.
  • CatBytes(Vec<u8>) - The data received from IPFS.
  • FindPeer(Vec<OpaqueMultiaddr>) - A list of addresses known to be related to a PeerId.
  • GetClosestPeers(Vec<Vec<u8>>) - The list of PeerIds closest to the given PeerId.
  • GetProviders(Vec<Vec<u8>>) - A list of PeerIds known to provide the given Cid.
  • Identity(Vec<u8>, Vec<OpaqueMultiaddr>) - The local node's public key and the externally visible and listened to addresses.
  • LocalAddrs(Vec<OpaqueMultiaddr>) - A list of local node's externally visible and listened to addresses.
  • LocalRefs(Vec<Vec<u8>>) - A list of locally available blocks by their Cids.
  • Peers(Vec<OpaqueMultiaddr>) - The list of currently connected peers.
  • RemoveBlock(Vec<u8>) - The Cid of the removed block.
  • Success - A request was processed successfully and there is no extra value to return.
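
Responses are usually consumed by matching on the variant you expect. The sketch below shows this for BitswapStats, using a trimmed copy of IpfsResponse with the fields listed above. The duplicate_block_percent helper is hypothetical and only meant to show how the named fields can be destructured.

```rust
// Trimmed copy of the enum; only a few variants are reproduced here.
#[derive(Debug, Clone, PartialEq)]
pub enum IpfsResponse {
    AddBytes(Vec<u8>),
    BitswapStats {
        blocks_sent: u64,
        data_sent: u64,
        blocks_received: u64,
        data_received: u64,
        dup_blks_received: u64,
        dup_data_received: u64,
        peers: Vec<Vec<u8>>,
        wantlist: Vec<(Vec<u8>, i32)>,
    },
    CatBytes(Vec<u8>),
    Success,
}

/// Hypothetical helper: the percentage of received blocks that were duplicates.
/// Returns `None` for other responses, or when no blocks were received at all.
pub fn duplicate_block_percent(response: &IpfsResponse) -> Option<u64> {
    match response {
        IpfsResponse::BitswapStats {
            blocks_received,
            dup_blks_received,
            ..
        } if *blocks_received > 0 => Some(*dup_blks_received * 100 / *blocks_received),
        _ => None,
    }
}
```

A wildcard arm is reasonable here because the helper only cares about one variant; code that must handle every response should list the variants explicitly instead.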

IpfsRequestStatus

An enum that represents the status of an IPFS request.

  • DeadlineReached - Deadline was reached while we waited for this request to finish.
  • IoError(Vec<u8>) - An error has occurred during the request, for example a timeout or the remote has closed our socket.
  • Invalid - The passed ID is invalid in this context.
  • Finished(IpfsResponse) - The request has finished successfully.

IpfsError

An enum that enumerates types of errors returned from the IPFS node.

  • DeadlineReached = 1 - The requested action couldn't be completed within the deadline.
  • IoError = 2 - There was an IO Error while processing the request.
  • Invalid = 3 - The ID of the request is invalid in this context.
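
The two enums above mirror each other, which makes it easy to turn a status into a Result. The following sketch shows one way to do that. The into_result function is hypothetical, the #[repr(u32)] attribute is an assumption made so the discriminants listed above can be read back with a cast, and IpfsResponse is reduced to two variants to keep the example short.

```rust
// Reduced stand-in for the response type.
#[derive(Debug, Clone, PartialEq)]
pub enum IpfsResponse {
    AddBytes(Vec<u8>),
    Success,
}

#[derive(Debug, Clone, PartialEq)]
pub enum IpfsRequestStatus {
    DeadlineReached,
    IoError(Vec<u8>),
    Invalid,
    Finished(IpfsResponse),
}

// The numeric values follow the list above; the representation is assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u32)]
pub enum IpfsError {
    DeadlineReached = 1,
    IoError = 2,
    Invalid = 3,
}

/// Hypothetical conversion: a finished request becomes `Ok`, everything else an error code.
pub fn into_result(status: IpfsRequestStatus) -> Result<IpfsResponse, IpfsError> {
    match status {
        IpfsRequestStatus::Finished(response) => Ok(response),
        IpfsRequestStatus::DeadlineReached => Err(IpfsError::DeadlineReached),
        // The error message carried by the status is dropped in this sketch.
        IpfsRequestStatus::IoError(_) => Err(IpfsError::IoError),
        IpfsRequestStatus::Invalid => Err(IpfsError::Invalid),
    }
}
```

With a conversion like this, pallet code can use the ? operator on request results instead of matching on every status by hand.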

We hope this has been a helpful overview of the core modifications that we made to Substrate in order to enable embedded IPFS functionality. If you have questions, or if you're ready to jump in and contribute, or if you're ready to build your own offchain::ipfs-powered dApp, read on.

Contributing / Development

There are several ways that people can help offchain::ipfs.

Create a dApp or a custom Substrate pallet, and tell us about it

Now that the proof-of-concept is complete and this manual is published, our goal is to be laser-focused on user needs and to extend the capabilities of offchain::ipfs based on them. So, if you end up trying this out, we would love to know how, and what your use case is.

Support offchain::ipfs financially

Equilibrium is a consultancy that builds on the distributed web, and helps you build on it too. If you're part of an organization and want help implementing or improving offchain::ipfs, get in touch with us.

  • If you're an individual and want to support us financially, visit our OpenCollective.

Implement more IPFS functionality

Some of the functionality that exists in Rust IPFS is not exposed via offchain::ipfs, and some IPFS functionality is not implemented in Rust IPFS at all. If there's something that another IPFS implementation does that you'd like offchain::ipfs to do, let us know or take a crack at implementing it yourself.