We’re big fans of Semantic Web, IPFS, Ethereum, MaidSafe, Eve, Unison, Synereo, XiA, Bitcoin, Urbit, and countless others. The trouble is, they’re all pulling in different directions, with different emphases. None offer a complete solution, on their own, to the future of software and the internet. Open exploration is admirable, but finding consensus and shared foundations is critical to gradually integrating the best ideas into a unified software ecosystem.
We believe that a standardized, minimalist, graph-oriented persistent data model is the best way to promote beneficial collaboration, competition, and cross-polination among the exploding array of decentralized internet technology projects. InfoCentral’s design proposal provides this, followed by an elegant semantic graph model derived from Semantic Web standards.
Whether proprietary or open source, whether desktop, mobile or web-based, application-centric design locks functionality and data into silos that make computing less fluid and natural. Applications are like machines with a fixed set of functions that must be taken as a whole. They cannot be easily combined with other functionality, thus limiting users expression and creativity. Those who are not professional programmers have no hope to change and adapt their behavior. Even for the best software engineers, applications (and API-oriented development) are highly fragile and have enormous maintenance costs.
The future is app-free computing – a world of modular and fully composeable software functionality that comes alongside neutral information instead of trying to partition and control it.
We admire thought-leaders like Bret Victor, Chris Granger, Paul Chiusano, who have all recognized that the way we program today (and even just use computers in general) doesn’t make sense. Current methods aren’t natural for humans, widen digital divides by making technology too difficult, and at the same time create hindrances for deep innovation in machine learning and intelligence. As Granger notes, programming needs to be direct, observable, and free of incidental complexity – not an arcane exercise in weird syntaxes, wrangling of black boxes, and removal from the problems at hand by endless layers of arcane abstraction. For programming to become learnable and thereby accessible to all, it must be possible to see all state and control flow in operation, to access all vocabulary in context, to modify behavior in place with immediate result, and to decompose, recompose, and abstract at will.
Existing projects in the area of natural UIs and programming understandably tend to first focus on human-facing languages, visualizable data structures, and related interactive modalities. While inspirational, none propose a standard, globally-scalable, graph-structured persistent data model that is capable of bridging their experiments with broader research in distributed systems. We believe that user environments are best built up from shared semantically-rich information that is designed before a single piece of code is written. Taking the infocentric approach allows everything around information to develop independently. It is insufficient to simply wrap relational or JSON document databases, leaving semantics to casually evolve through competing remotable code interfaces. Likewise, starting with a functional language or VM leads to immediate dependencies. Composition over the neutral foundation of shared semantic graph data allows for unhindered language, UI, and network research. To avoid leaky abstractions, the complexities of secure distributed interaction must be addressed from the beginning, in a platform, language, and network neutral manner. Factors around these global concerns directly affect how features needed by programmable UIs are later provided. They also determine whether the resulting systems will be machine-friendly as well.
Communication and collaboration protocols and social network services continue to proliferate wildly, even though the actual data exchanged is roughly the same. For instance, it makes no sense for there to exist dozens of popular channels for sending or publishing short pieces of text. This is the application-centric philosophy at work, which wraps together data, presentation, and quality of service. In doing so, it creates ever more functionality silos.
InfoCentral standardizes all of the data artifacts around communication, collaboration, and social computing, separating them from services and software. It then allows networks to evolve around the open data, providing for varying quality-of-service needs. Composeable software, under full local control of users, can then use and adapt this shared communications data toward unlimited varieties of interactions and user experiences.
Today, there is no valid reason for personal and private business information to be scattered across dozens of isolated filesystems, databases, storage mediums, devices, public and private internet services, and web applications. This is simply an artifact of the past era of computing, where devices and softwares were largely designed as standalone “appliances” that didn’t need to interact to one another – forcing the user to do all the work.
We believe that all information should be integrated, all the time, without artificial boundaries. Users shouldn’t have to worry about manually moving data around or wrestling it into different formats for different uses. Information should never be trapped at service or application boundaries. And it should be trivial to ensure that all of one’s information is stored redundantly.
It’s impossible to create a consistent and unified view of the world. The world is eventually-consistent and so is all real-world information. Truth exists at the edges of networks and propagates from there. Disagreements also arise and propagate. Social trust varies over time. Decisions must be made with incomplete and sometimes conflicting information.
The only valid solution is to accept that information will be multi-sourced, but make it easily layerable. To do this requires stability of reference, so that compositions and annotations can be built across potentially-antagonisitic datasets. This is a primary motivation for our exclusive use of hash-based data referencing.
One of the primary challenges of the Semantic Web effort has been the creation of useful ontologies. It is enormously difficult to achieve global, cross-cultural standardization of common concepts, with parallels seen in natural language translation. We do not believe this is an unsolveable problem, however. In fact, recent deep learning translation research successes may point the way. Instead of starting by manually designing ontologies, machine generated concept maps can be used to seed the process of collaborative ontology development. InfoCentral’s proposal for stable hash-based identities of concept nodes, along with layering and context discovery, make this feasible as a globally-scalable, evolvable solution.
Like most distributed internet projects, InfoCentral promotes users’ control of their own information, with flexible control of data visibility through ubiquitous cryptography and reliable attribution through signing. InfoCentral promotes direct network service models over the user-surveillance and forced-advertising models relied upon by nearly all proprietary websites and apps. Unlike other projects, however, InfoCentral does not propose that everyone should use the same distributed network model. (ex. It has no dependencies on blockchains or DHTs.) By standardizing information first, users are free to switch among networks and software at will.
Interaction Patterns are declarative contracts for how shared graph data is used to accomplish things like sending a message, collaboratively editing a document, engaging in a threaded discussion, conducting a secret ballot, bidding on an auction, making a reservation, or playing a game. Today, all of these sorts of interactions would need specialized software, like mobile apps. In the InfoCentral model, users can just grab a pattern, share it with participants, and start interacting.
Any user and system can operate over the global data graph. There is no need for custom web services as coordination points. Any sharable data repository will do. Services are simply automated participants in Interaction Patterns. They might be a local centralized trusted system. They might be a Distributed Autonomous Organization living in a blockchain network.
One size never fits all. The InfoCentral model promotes diverse public and private networks that can be woven together seamlessly thanks to layerable data and reliable referencing.
While it’s hard to get people to agree, it’s even harder today to get people to talk constructively. Layered, hash-referenced information allows many participants to engage one another without censorship on any side. It ensures reliable conversation history and the ability to contextualize, cross-reference, annotate, and revise discussion points over time. With engaged communities, the best information and arguments can rise to the top, even amidst lack of true consensus. It almost goes without saying that such tools will also be a boon to communities already accustomed to civil discourse, like academic and scientific research.
The Internet of Things is a great idea without the proper infrastructure to support it. Current solutions are embarassingly clumsy, insecure, inflexible, and unreliable. Consider the ridiculousness of a home thermostat or lighting appliance that must connect to a central web server just to set an integer value or an array of datetime-value tuples – all through an opaque proprietary interface that can only talk to a special mobile app. Such solutions are nowhere close to the promise of universal interoperability that has defined Pervasive Computing research.
The semantic graph data standardization that InfoCentral proposes is the ideal universal interface for composing tomorrows Pervasive Computing environments, bringing IoT devices into genuinely integrated meshes of personal and commercial functionality.
InfoCentral is a next-generation internet engineering project and proposal. It combines Information-Centric Networking, persistent graph data models, declarative programming, and the best elements of the Semantic Web into a new software and internet architecture – one that is fundamentally decentralized and distributable, while also easier to secure.
An information-centric internet and software ecosystem is fundamentally more composeable, contextual, and collaborative. Apps and sites are replaced by a fully integrated information environment and personalizable workspaces. The user is free to layer and adapt information and software to their needs, whether that user is human or AI.
InfoCentral has exciting practical applications for early adopters. However, it ultimately designs for a future driven by practical forms of artificial intelligence, more collaborative social and economic patterns, and an expectation of universal technology interoperability.
Current software and internet architectures no longer properly support our ambitions. The InfoCentral proposal comprises a vision and set of principles to create clean-slate, future-proof open standards for information management, software engineering, and Internet communication. While InfoCentral builds upon academic research, it is a practical engineering project intent on real-world results.
Within the InfoCentral data model, entities are exclusively referencable using cryptographically-secure hash values. Unlike URIs, hash IDs never go stale. They are mathematically linked to the data they reference, making them as reliable as the hash algorithm. InfoCentral designs take into account the need to migrate to stronger algorithms over time, while also mitigating the impact of discovered weaknesses. (ex. multi-hash references, secure nonces, size and other reference metadata, strict schema validations, etc.)
Human-meaningful naming necessarily creates mutable pointers. These are strictly disallowed by the data model because they are inherently unstable and not conducive to decentralized collaborative information. While arbitrary object name metadata is supported at the UI level, memorable identifiers comparable to DNS and file paths are a false requirement based on legacy designs. There is no need to remember and input arbitrary names and addresses in a properly designed information environment. Likewise, AI has no use for human naming but does require the mathematical reliability that only hash-based identities can provide.
Global, reliable dereferencing is historically unrealistic in practice, even before considering the need for permanent, flat data identity. Current approaches are costly and fragile. Going forward, the best approach is to support modularity. Network innovation must be unhindered, so that economics and popularity can drive QoS. Many networks and contained information overlays will also be private. The InfoCentral proposal has no expectation of a single global DHT, blockchain, or similar structure, though such approaches may be useful to spread lightweight information about available networks and to serve as a bootstrapping mechanism.
We wholesale reject hierarchical naming and resolution schemes (ie. two-phase) in which data identity is inseparably conflated with a network-specific locator component – even if it is PKI/hash-based. However, for the internal management of data exchange, networks may use any suitable packet identification, routing and metadata schemes. These are invisible and orthogonal to the persistent data model, which is entirely portable between systems and networks.
Information-centric networks make data directly addressable and routable, abstracting most or all aspects of physical networks and storage systems. This causes data itself to become independent of the artifacts that support its physical existence, effectively removing the distinction between local and global resources. Users and high-level software are thus liberated from worrying about these artifacts and may treat all data as if it were local. A request for a data entity by its hash ID returns its contents, without knowledge of where it came from or how it was retrieved.
Unlike some related projects, InfoCentral intentionally does not specify a single, particular networking scheme. One-size-fits-all network designs are economically detrimental. Redundancy and performance needs vary greatly and often cannot be predicted. Many host-based and content-based networks can be used to transparently back InfoCentral-style repositories, each bringing their own unique economics and QoS parameters. Meanwhile, information itself has permanence while the networks and software around it evolve.
Networks of the future will be smarter, with content-awareness often driving replication. Constellations of linked, related, and adjacently-accessed information will tend to become clustered near locations where popularity is high. Service of subscriptions and interest registrations will likewise play a large role in shaping data flows.
In any system founded upon immutable data structures, an out-of-band mechanism must provide a means to aggregate or notify of new data over time. Having rejected mutable pointers, InfoCentral instead uses reference metadata collections to gather awareness of new data around existing data. References metadata is simply data about what other entities reference a given entity (and potentially why). For example, a new revision references a previous revision or revision collection root. Upon creation, knowledge of its existence will be propagated to interested users.
Any given reference metadata collection is inherently partial knowledge about globally existent references to an entity. All nodes have their own collections per entity. The means of management are left unspecified because there are many possible schemes of propagation across and between varied networking schemes. Again, this allows for endless specialization without changing the data model – from state-based CRDTs to even fully synchronous replication among federated repositories.
Metadata collections allow for unlimited layering of information from unlimited sources. It is up to data consumers to decide which metadata is useful, for example based on type, timestamp, or signatures from trusted parties. Networks may also have rules about what metadata references they are willing to collect and/or they may provide query capabilities for clients.
Structuring information as a persistent graph is the only method that allows unlimited, global-scale, coordination-free composition and collaboration. Persistent graphs are even more powerful for data than hyperlinks were for the web of HTML pages. They allow precise 3rd party references that cannot break later, so long as the referenced entity exists somewhere in the world. The exclusive use of hash-based references means that data entities natively form a directed acyclic graph. With metadata reference collection, however, this becomes a bidirectional graph in the local scope. (similar to web search engine “referenced by” indexing)
All higher-level data structures built upon the persistent data model may take advantage of basic graph semantics. Semantic Web data is an obvious native fit, but all forms of personal and business data will be able to take advantage of the features that the graph data model provides, such as default versioning and annotation capabilities.
Programming models where code owns mutable data are incredibly fragile and the source of most software problems today. Code and data must become orthogonal so that re-use is not hindered. Code may be applied to operate upon data and produce new data, but may not own data or change what already exists. This is a sharp departure from mainstay Object Oriented Programming and it requires a complete paradigm shift in thinking and development methodology. Fortunately, functional programming research has already paved the way to this future. It is the natural fit for the persistent graph data model we envision, in combination with other declarative models, of which functional is a branch.
Declarative code is the most effective route toward widespread parallelization. As processor core count continues to grow exponentially, this will quickly become non-negotiable. Declarative code is also the shortest path to verifiably secure systems and is the easiest for AI to reason about. Likewise, flow of data and control can be easily visualized and analyzed in a working system.
The graph of immutable entities is the universal software interface. Users, whether human or machine, interact solely by adding new entities that reference existing entities. Patterns of doing so are captured by declarative code, enabling standardization of useful interactions without the data encapsulation and dependency-creation of traditional APIs. Many Interaction Patterns can be used over the same open public data graph. Thanks to the elimination of shared writable objects through data entity immutability, users’ interactions cannot interfere with one another. This allows unlimited public and private overlays without needing permission or coordination. There is likewise no need to sandbox code, rather we may designate read access policies. Like patterns themselves, these policies can be collaboratively developed and trusted.
Modern software design usually starts with human-oriented user stories, often focused on views, and is dictated by a hierarchy of functionality designed to support these. This is incompatible with creating systems that are natively useful to AI. It is also incompatible with creating fully integrated information environments for humans, the ultimate realization of which is Pervasive Computing.
Pattern-driven graph interactions form the foundation upon which all higher-level user stories are realized. By reducing interaction to declared abilities and intentions, all human UI modalities can be automatically generated. Preferences and the user’s environment may be taken into account automatically.
Access controls are notoriously difficult to perfectly enforce. They also result in data being bound to particular systems and harder to securely and reliably back up. While cryptography is no panacea, it can at least consolidate security practices to a manageable number.
In the modern world, the architecture of information and the technology surrounding it dramatically influences how people interact with both technology and each other. As with public infrastructure, changes to IT architecture often produce massive downstream social changes. There should therefore be a great sense of responsibility when engineers design information systems.
We believe that the communication aspects InfoCentral focuses on can improve society in the areas of collaboration, contextual clarity, community-building, and civility. Education, healthcare, government, commerce, media, the arts, religion, and interpersonal relations can all benefit from these improvements.
Because InfoCentral is a multi-disciplinary effort, it aims to draw a diverse community of participants. As an open source project, it will involve many developers. As a practical application of research, it has connections to academia. As a tool for social progress, it requires involvement with the public and non-profit sectors. As a platform for development and commerce, it is of interest to entrepreneurs and business leaders.
InfoCentral has two primary operational arenas: core architecture and practical applications. The core architecture division is responsible for all low-level design and reference implementation of the data management and information environment standards. Numerous application teams focus on building generic modules and support necessary to enable particular end-user interactions and use cases. These may include crowd-sourced efforts, industry-specific consortiums, consultants, etc. Application teams build infrastructure, not “applications” in the software lingo sense. Because infrastructure is shared, cross-team collaboration should be the norm. The goal is that as little code as possible should be dedicated to meeting particular end-user needs.
You may contact project lead by emailing him at his first name at the domain of this website.
Copyright 2017, by Chris Gebhardt.