We’re big fans of the Semantic Web, IPFS, Ethereum, MaidSafe, Eve, Unison, Synereo, XiA, Bitcoin, Urbit, and countless others. The trouble is, they’re all pulling in different directions, with different emphases. Each is largely its own isolated world. None offers a complete solution, on its own, to the future of software and the internet. Open exploration is awesome, but finding consensus and shared foundations is critical to gradually integrating the best ideas into a unified software ecosystem.
We believe that a minimalist standard persistent data model is the best way to promote beneficial collaboration, competition, and cross-pollination among the exploding array of decentralized internet technology projects. InfoCentral’s design proposal provides this, followed by an elegant semantic graph model derived from Semantic Web standards.
Whether proprietary or open source, whether desktop, mobile, or web-based, application-centric design locks functionality and data into silos that make computing less fluid and natural. Applications are like machines with a fixed set of functions that must be taken as a whole. They cannot be easily combined with other functionality, thus limiting users’ expression and creativity. Those who are not professional programmers have no hope of changing or adapting their behavior. Even for the best software engineers, applications (and API-oriented development) are highly fragile and have enormous maintenance costs.
The future is app-free computing – a world of modular and fully composable software functionality that comes alongside neutral information instead of trying to partition and control it.
We admire thought leaders like Bret Victor, Chris Granger, and Paul Chiusano, who have all recognized that the way we program today (and even just use computers in general) doesn’t make sense. Current methods aren’t natural for humans, widen digital divides by making technology too difficult, and at the same time hinder deep innovation in machine learning and intelligence. As Granger notes, programming needs to be direct, observable, and free of incidental complexity – not an arcane exercise in weird syntaxes, wrangling of black boxes, and removal from the problems at hand by endless layers of abstraction. For programming to become learnable and thereby accessible to all, it must be possible to see all state and control flow in operation, to access all vocabulary in context, to modify behavior in place with immediate result, and to decompose, recompose, and abstract at will.
Existing projects in the area of natural UIs and graphical programming understandably tend to focus on languages, data structures, and related interactive modalities first. While inspirational, they lack the backing of globally-standardized graph-structured persistent data that is capable of outlasting their experiments. We believe that UIs are best built up from semantically-rich information that is designed before a single piece of code is written. It is insufficient to wrap simple relational or JSON document databases and call it a day. The complexities of multi-party distributed interaction across globally shared data must be addressed from the beginning.
Communication and collaboration protocols and services continue to proliferate wildly, even though the actual data exchanged is roughly the same. For instance, it makes no sense for there to exist dozens of popular channels for sending or publishing short pieces of text. This is the application-centric philosophy at work, which wraps together data, presentation, and quality of service. In doing so, it creates even more functionality silos.
InfoCentral standardizes all of the data artifacts around communication and collaboration, separating them from services and software. It then allows networks to evolve around the open data, providing for varying quality-of-service needs. Software, under full local control of users, can then use and adapt the communications data as needed.
Today, there is no valid reason for personal and private business information to be scattered across dozens of isolated filesystems, databases, storage media, devices, public and private internet services, and web applications. This is simply an artifact of a past era of computing, in which devices and software were largely designed as standalone “appliances” that didn’t need to interact with one another – forcing the user to do all the work.
We believe that all information should be integrated, all the time, without artificial boundaries. Users shouldn’t have to worry about manually moving data around or wrestling it into different formats for different uses. Information should never be trapped at service or application boundaries. And it should be trivial to ensure that all of one’s information is stored redundantly.
It’s impossible to create a consistent and unified view of the world. The world is eventually-consistent and so is all real-world information. Truth exists at the edges of networks and propagates from there. Disagreements also arise and propagate. Social trust varies over time. Decisions must be made with incomplete and sometimes conflicting information.
The only valid solution is to accept that information will be multi-sourced, but make it easily layerable. To do this requires stability of reference, so that compositions and annotations can be built across potentially antagonistic datasets. This is a primary motivation for exclusive use of hash-based data referencing.
Like most distributed internet projects, InfoCentral promotes users’ control of their own information, with flexible control of data visibility through ubiquitous cryptography and reliable attribution through signing. InfoCentral promotes direct service models over the user-surveillance and forced-advertising models relied upon by nearly all proprietary websites and apps.
Interaction patterns are declarative contracts for how shared graph data is used to accomplish things like sending a message, getting in line at a restaurant, conducting a secret ballot, bidding on an auction, collaboratively editing a document, or playing a game. Today, all of these sorts of interactions would need specialized software. In the InfoCentral model, users can just grab a pattern, share it with participants, and start interacting.
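A pattern of this kind can be imagined as nothing more than a shareable data record that software interprets. A minimal sketch of the restaurant-queue example, with the caveat that every field name below is invented for illustration and not part of any InfoCentral specification:

```python
# A hypothetical interaction pattern: a declarative description of roles
# and the entities each role may add to the shared graph, rather than
# application code. All field names are invented for illustration.
queue_pattern = {
    "type": "interaction-pattern",
    "name": "restaurant-queue",
    "roles": ["host", "guest"],
    "moves": [
        {"role": "guest", "adds": "queue-entry", "references": "queue-root"},
        {"role": "host",  "adds": "seated",      "references": "queue-entry"},
    ],
}

# Software interprets the pattern to determine which moves a user may make.
guest_moves = [m for m in queue_pattern["moves"] if m["role"] == "guest"]
assert [m["adds"] for m in guest_moves] == ["queue-entry"]
```

Because the pattern is plain data, sharing it with participants is the same operation as sharing any other entity.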
Anybody and any system can operate over the global data graph. There is no need for custom web services as coordination points. Any sharable repository will do. Services are simply automated participants in interaction patterns. They might be a single trusted system. They might be a Distributed Autonomous Organization living in a blockchain network.
One size never fits all. Create both public and private networks that can be woven together seamlessly thanks to layerable data.
While it’s hard to get people to agree, it’s even harder today to get them talking constructively. Layered, hash-referenced information allows many participants to engage one another without censorship on any side. It ensures reliable conversation history and the ability to contextualize, cross-reference, annotate, and revise discussion points over time. With engaged communities, the best information and arguments can rise to the top, even amidst lack of true consensus. It almost goes without saying that such tools will also be a boon to communities already accustomed to civil discourse, like academic and scientific research.
InfoCentral is a next-generation internet engineering project and proposal. It combines Information-Centric Networking, persistent graph data models, functional programming, and core elements of the Semantic Web into a new software and internet architecture – one that is fundamentally decentralized and distributable, while also easier to secure.
An information-centric internet is fundamentally more composable, contextual, and collaborative. Apps and sites are replaced by a fully integrated information environment and personalizable workspaces. The user is entirely free to adapt information and software to their needs, whether that user is human or AI.
InfoCentral has exciting practical applications for early adopters. However, it ultimately designs for a future that will increasingly be driven by forms of artificial intelligence, more collaborative social and economic patterns, and an expectation of universal technology interoperability.
Current software and internet architectures no longer properly support our ambitions. The InfoCentral proposal comprises a vision and set of principles to create clean-slate, future-proof open standards for information management, software engineering, and Internet communication. While InfoCentral builds upon academic research, it is a practical engineering project intent on real-world results.
Within the InfoCentral data model, entities are exclusively referenceable using cryptographically secure hash values. Unlike URIs, hash IDs never go stale. They are mathematically linked to the data they reference, making them as reliable as the hash algorithm. InfoCentral designs take into account the need to migrate to stronger algorithms over time, while also mitigating the impact of discovered weaknesses (e.g. multi-hash references, secure nonces, size and other reference metadata, strict schema validation).
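The core property can be sketched in a few lines of Python. This is a simplified single-algorithm illustration (the function names are ours, and the real proposal additionally covers multi-hash references and reference metadata):

```python
import hashlib

def hash_id(content: bytes) -> str:
    # Derive an entity's reference directly from its content.
    return "sha256:" + hashlib.sha256(content).hexdigest()

def verify(ref: str, content: bytes) -> bool:
    # A hash reference can never go stale: the content either
    # matches the reference or it does not.
    algo, digest = ref.split(":", 1)
    return algo == "sha256" and hashlib.sha256(content).hexdigest() == digest

entity = b"an immutable data entity"
ref = hash_id(entity)
assert verify(ref, entity)           # the reference is bound to this exact data
assert not verify(ref, b"tampered")  # any change breaks the link
```

Because verification requires only the reference and the data, any party can check integrity without trusting the network that delivered the entity.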
Human-meaningful naming necessarily creates mutable pointers. These are strictly disallowed by the data model because they are inherently unstable and not conducive to decentralized collaborative information. While arbitrary object name metadata is supported at the UI level, memorable identifiers comparable to DNS and file paths are a false requirement based on legacy designs. There is no need to remember and input arbitrary names and addresses in a properly designed information environment. Likewise, AI has no use for human naming but does require the mathematical reliability that only hash-based identities can provide.
Global, reliable dereferencing is historically unrealistic in practice, even before considering the need for permanent, flat data identity. Current approaches are costly and fragile. Going forward, the best approach is to support modularity. Network innovation must be unhindered, so that economics and popularity can drive QoS. Many networks and contained information overlays will also be private. The InfoCentral proposal has no expectation of a single global DHT, blockchain, or similar structure, though such approaches may be useful to spread lightweight information about available networks and to serve as a bootstrapping mechanism.
We wholesale reject hierarchical naming and resolution schemes (i.e. two-phase) in which data identity is inseparably conflated with a network-specific locator component – even if it is PKI/hash-based. However, for the internal management of data exchange, networks may use any suitable packet identification, routing, and metadata schemes. These are invisible and orthogonal to the persistent data model, which is entirely portable between systems and networks.
Information-centric networks make data directly addressable and routable, abstracting most or all aspects of physical networks and storage systems. This causes data itself to become independent of the artifacts that support its physical existence, effectively removing the distinction between local and global resources. Users and high-level software are thus liberated from worrying about these artifacts and may treat all data as if it were local. A request for a data entity by its hash ID returns its contents, without knowledge of where it came from or how it was retrieved.
Unlike some related projects, InfoCentral intentionally does not specify a single, particular networking scheme. One-size-fits-all network designs are economically detrimental. Redundancy and performance needs vary greatly and often cannot be predicted. Many host-based and content-based networks can be used to transparently back InfoCentral-style repositories, each bringing their own unique economics and QoS parameters. Meanwhile, information itself has permanence while the networks and software around it evolve.
Networks of the future will be smarter, with content-awareness often driving replication. Constellations of linked, related, and adjacently-accessed information will tend to become clustered near locations where popularity is high. Service of subscriptions and interest registrations will likewise play a large role in shaping data flows.
In any system founded upon immutable data structures, an out-of-band mechanism must provide a means to aggregate or point to new data over time. Having rejected mutable pointers, InfoCentral instead uses reference metadata collections to gather awareness of new data around existing data. Reference metadata is simply data about what other entities reference a given entity (and potentially why). For example, a new revision references a previous revision or revision collection root. Upon creation, knowledge of its existence is propagated to interested users.
Any given reference metadata collection is inherently partial knowledge about globally existent references to an entity. All nodes have their own collections per entity. The means of management are left unspecified because there are many possible schemes of propagation across and between varied networking schemes. Again, this allows for endless specialization without changing the data model – even fully synchronous replication among federated repositories.
Metadata collections allow for unlimited layering of information from unlimited sources. It is up to data consumers to decide which metadata is useful, for example based on type, timestamp, or signatures from trusted parties. Networks may also have rules about what metadata references they are willing to collect and/or they may provide query capabilities for clients.
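One node’s view of this mechanism can be sketched as an in-memory repository. The class and field names below are hypothetical, not part of the InfoCentral specification, and real collections would be filtered and propagated between nodes as described above:

```python
import hashlib

def hash_id(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

class Repository:
    # A minimal sketch of one node: immutable entities plus a
    # per-entity reference metadata collection (partial by nature).
    def __init__(self):
        self.entities = {}       # id -> content
        self.referenced_by = {}  # id -> set of ids that reference it

    def put(self, content: bytes, references=()):
        eid = hash_id(content)
        self.entities[eid] = content
        # Gather awareness of the new entity around the entities it references.
        for target in references:
            self.referenced_by.setdefault(target, set()).add(eid)
        return eid

repo = Repository()
v1 = repo.put(b"report, revision 1")
v2 = repo.put(b"report, revision 2", references=[v1])
# Starting from v1, a consumer can now discover the newer revision v2,
# even though v1 itself was never modified.
assert repo.referenced_by[v1] == {v2}
```

In this model, a consumer following the collection decides for itself which discovered references to trust and use.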
Structuring information as a persistent graph is the only method that allows unlimited, global-scale, coordination-free composition and collaboration. Persistent graphs are even more powerful for data than hyperlinks were for the web of HTML pages. They allow precise third-party references that cannot break later, so long as the referenced entity exists somewhere in the world. The exclusive use of hash-based references means that data entities natively form a directed graph. With metadata reference collections, this becomes a bidirectional graph in the local scope (similar to web search engines’ “referenced by” indexing).
All higher-level data structures built upon the persistent data model may take advantage of basic graph semantics. Semantic Web data is an obvious native fit, but all forms of personal and business data will be able to take advantage of the features that the graph data model provides, such as default versioning and annotation capabilities.
Programming models where code owns mutable data are incredibly fragile and the source of most software problems today. Code and data must become orthogonal so that re-use is not hindered. Code may be applied to operate upon data and produce new data, but may not own data or change what already exists. This is a sharp departure from mainstream Object-Oriented Programming, and it requires a complete paradigm shift in thinking and development methodology. Fortunately, functional programming research has already paved the way to this future. It is the natural fit for the persistent graph data model we envision, in combination with other declarative models, of which functional programming is a branch.
Declarative code is the default route toward parallelization. As processor core count continues to grow exponentially, this will quickly become non-negotiable. Declarative code is also the shortest path to verifiably secure systems and is the easiest for AI to reason about. And declarative code itself also fits the data graph perfectly. Flow of data and control can be easily visualized and analyzed in a working system.
The graph of immutable entities is the universal software interface. Users, whether human or machine, interact solely by adding new entities that reference existing entities. Patterns of doing so are captured by declarative code, enabling standardization of useful interactions without the data encapsulation and dependency creation of traditional APIs. Many patterns can be used over the same open public data graph. Because immutability eliminates shared writable objects, users’ interactions cannot interfere with one another. This allows unlimited public and private overlays without needing permission or coordination. There is likewise no need to sandbox code, only to designate read-access policies. Like patterns themselves, policies can be collaboratively developed and trusted.
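The non-interference property can be illustrated with an in-memory store, where two parties independently annotate the same entity. The record fields below are invented for illustration:

```python
import hashlib
import json

def put(store: dict, record: dict) -> str:
    # Append an immutable entity; its ID is the hash of a canonical
    # serialization (a simplification of real content addressing).
    blob = json.dumps(record, sort_keys=True).encode()
    eid = hashlib.sha256(blob).hexdigest()
    store[eid] = record
    return eid

store = {}
doc = put(store, {"type": "note", "text": "shared draft"})

# Two independent parties layer annotations over the same entity,
# without coordination and without touching the original.
a = put(store, {"type": "comment", "on": doc, "by": "alice", "text": "+1"})
b = put(store, {"type": "comment", "on": doc, "by": "bob", "text": "needs a source"})

assert store[doc] == {"type": "note", "text": "shared draft"}  # original untouched
comments = {eid for eid, r in store.items() if r.get("on") == doc}
assert comments == {a, b}  # both overlays coexist without interference
```

Neither party needed write access to the other’s data, or to the original note, to contribute.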
Modern software design usually starts with human-oriented user stories, often focused on views, and is dictated by a hierarchy of functionality designed to support these. This is incompatible with creating systems that are natively useful to AI. It is also incompatible with creating fully integrated information environments for humans, the ultimate realization of which is Pervasive Computing.
Pattern-driven graph interactions form the foundation upon which all higher-level user stories are realized. By reducing interaction to declared abilities and intentions, all human UI modalities can be automatically generated. Preferences and the user’s environment may be taken into account automatically.
Access controls are notoriously difficult to enforce perfectly. They also bind data to particular systems and make it harder to back up securely and reliably. While cryptography is no panacea, it can at least consolidate security practices into a manageable number of well-understood mechanisms.
In the modern world, the architecture of information and the technology surrounding it dramatically influences how people interact with both technology and each other. As with public infrastructure, changes to IT architecture often produce massive downstream social changes. There should therefore be a great sense of responsibility when engineers design information systems.
We believe that the communication aspects InfoCentral focuses on can improve society in the areas of collaboration, contextual clarity, community-building, and civility. Education, healthcare, government, commerce, media, the arts, religion, and interpersonal relations can all benefit from these improvements.
Because InfoCentral is a multi-disciplinary effort, it aims to draw a diverse community of participants. As an open source project, it will involve many developers. As a practical application of research, it has connections to academia. As a tool for social progress, it requires involvement with the public and non-profit sectors. As a platform for development and commerce, it is of interest to entrepreneurs and business leaders.
InfoCentral has two primary operational arenas: core architecture and practical applications. The core architecture division is responsible for all low-level design and reference implementation of the data management and information environment standards. Numerous application teams focus on building generic modules and support necessary to enable particular end-user interactions and use cases. These may include crowd-sourced efforts, industry-specific consortiums, consultants, etc. Application teams build infrastructure, not “applications” in the software lingo sense. Because infrastructure is shared, cross-team collaboration should be the norm. The goal is that as little code as possible should be dedicated to meeting particular end-user needs.
You may contact the project lead by emailing him at his first name at the domain of this website.
Copyright 2017, by Chris Gebhardt.