Selected Publications

GraphFM: A Scalable Framework for Multi-Graph Pretraining

Graph Learning Transformers

Divyansha Lachi, Mehdi Azabou, Vinam Arora, and Eva Dyer

Preprint, 2024

Abstract: Graph neural networks are typically trained on individual datasets, often requiring highly specialized models and extensive hyperparameter tuning. This dataset-specific approach arises because each graph dataset often has unique node features and diverse connectivity structures, making it difficult to build a generalist model. To address these challenges, we introduce a scalable multi-graph multi-task pretraining approach specifically tailored for node classification tasks across diverse graph datasets from different domains. Our method, Graph Foundation Model (GraphFM), leverages a Perceiver-based encoder that employs learned latent tokens to compress domain-specific features into a common latent space. This approach enhances the model's ability to generalize across different graphs and allows for scaling across diverse data. We demonstrate the efficacy of our approach by training a model on 152 different graph datasets comprising over 7.4 million nodes and 189 million edges, establishing the first set of scaling laws for multi-graph pretraining on datasets spanning many domains (e.g., molecules, citation and product graphs). Our results show that pretraining on a diverse array of real and synthetic graphs improves the model's adaptability and stability, while performing competitively with state-of-the-art specialist models. This work illustrates that multi-graph pretraining can significantly reduce the burden imposed by the current graph training paradigm, unlocking new capabilities for the field of graph neural networks by creating a single generalist model that performs competitively across a wide range of datasets and tasks.
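To make the latent-compression idea concrete, here is a minimal PyTorch sketch of a Perceiver-style encoder in which a fixed set of learned latent tokens cross-attends to projected node features. It illustrates the mechanism described above, not the released GraphFM implementation; all module names, dimensions, and initialization choices are assumptions.

```python
# Minimal sketch (not the released GraphFM code): learned latent tokens
# cross-attend to projected node features, compressing an arbitrary-sized
# graph into a fixed-size latent representation.
import torch
import torch.nn as nn


class LatentGraphEncoder(nn.Module):
    def __init__(self, node_dim, latent_dim=256, num_latents=128, num_heads=8):
        super().__init__()
        # Learned latent tokens shared across graphs (the "common latent space").
        self.latents = nn.Parameter(torch.randn(num_latents, latent_dim) * 0.02)
        # Dataset-specific node features are projected into the shared width.
        self.input_proj = nn.Linear(node_dim, latent_dim)
        self.cross_attn = nn.MultiheadAttention(latent_dim, num_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.LayerNorm(latent_dim),
            nn.Linear(latent_dim, 4 * latent_dim),
            nn.GELU(),
            nn.Linear(4 * latent_dim, latent_dim),
        )

    def forward(self, node_features):
        # node_features: (num_nodes, node_dim) for a single graph.
        tokens = self.input_proj(node_features).unsqueeze(0)   # (1, N, D)
        latents = self.latents.unsqueeze(0)                    # (1, L, D)
        # Latents query the node tokens: cost is O(L * N), not O(N^2).
        compressed, _ = self.cross_attn(latents, tokens, tokens)
        return compressed + self.ff(compressed)                # (1, L, D)


if __name__ == "__main__":
    x = torch.randn(1000, 64)           # 1,000 nodes with 64-d raw features
    enc = LatentGraphEncoder(node_dim=64)
    print(enc(x).shape)                 # torch.Size([1, 128, 256])
```

Because the latent tokens (not the input projection) carry the shared representation, graphs with very different raw feature dimensions can, in principle, be mapped into the same latent space by swapping only the projection layer.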

Leveraging Perceiver IO and Relative Position Encodings for Enhanced Node Classification

Graph Learning Transformers

Divyansha Lachi, Vinam Arora, Mehdi Azabou, and Eva Dyer

SIAM Conference on Mathematics of Data Science (MDS24), Mini-symposium: New Frontier of Graph Machine Learning, 2024

Abstract: Graph transformer models have been limited in their application to large-scale graphs due to the quadratic computational complexity inherent to their attention mechanisms. This often results in a trade-off between scalability and model performance. In this work, we introduce a novel graph transformer architecture inspired by the PerceiverIO framework, which utilizes a combination of latent compression and relative position encoding to efficiently scale graph attention to larger graphs. Our approach mitigates the quadratic cost of pairwise communication between all nodes in a graph by learning a set of latent tokens through which nodes exchange messages. We evaluate our model on several node classification benchmarks and demonstrate that it not only surpasses existing graph transformer models in terms of performance but also maintains high efficiency and adaptability across different graph structures. Overall, our architecture offers a scalable solution for efficiently processing large graph datasets while significantly improving model performance and generalization.
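As an illustration of how relative position information can enter the attention computation, the sketch below adds a learned bias, indexed by shortest-path-distance buckets, to the attention logits. The specific encoding, module names, and bucket count are assumptions of this sketch, not details taken from the paper.

```python
# Illustrative sketch only: one way to inject relative position information
# into graph attention as an additive bias on the logits, here using
# shortest-path-distance buckets.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelPosSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=8, max_dist=8):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # One learned scalar bias per (distance bucket, head).
        self.dist_bias = nn.Embedding(max_dist + 1, num_heads)
        self.max_dist = max_dist

    def forward(self, x, dist):
        # x: (N, dim) node states; dist: (N, N) integer shortest-path distances.
        n = x.size(0)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(n, self.num_heads, self.head_dim).transpose(0, 1)  # (H, N, d)
        k = k.view(n, self.num_heads, self.head_dim).transpose(0, 1)
        v = v.view(n, self.num_heads, self.head_dim).transpose(0, 1)
        logits = q @ k.transpose(-2, -1) / self.head_dim ** 0.5       # (H, N, N)
        bias = self.dist_bias(dist.clamp(max=self.max_dist))          # (N, N, H)
        logits = logits + bias.permute(2, 0, 1)
        attn = F.softmax(logits, dim=-1)
        out = (attn @ v).transpose(0, 1).reshape(n, -1)
        return self.out(out)
```

Note that a dense N-by-N bias table is itself quadratic, so in practice a bias of this kind would be restricted to local neighborhoods or combined with the latent bottleneck described in the abstract rather than applied to full node-to-node attention.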

Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability

NeuroAI Reinforcement Learning

Divyansha Lachi, Ann Huang, Augustine N Mavor-Parker, Arna Ghosh, Blake Richards, and Anthony Zador

Preprint, 2024

Abstract: The development of neural connectivity is a crucial biological process that gives rise to diverse brain circuits and behaviors. Neural development is a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. We show that this model can be viewed as a generalization of a powerful class of artificial neural networks, Bayesian neural networks, in which each network parameter is a sample from a distribution. Our results reveal that stochasticity confers a greater benefit in large networks and variable environments, which may explain its role in organisms with larger brains. Surprisingly, we find that the average fitness over a population of agents is higher than that of a single agent defined by the average connection probability. Our model reveals how developmental stochasticity, by inducing a form of non-heritable phenotypic variability, can increase the probability that at least some individuals will survive in rapidly changing, unpredictable environments. Our results suggest how stochasticity may be an important feature rather than a bug in neural development.
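A toy NumPy sketch of the core idea, under the assumption that the genotype is a matrix of type-to-type connection probabilities and development is a Bernoulli draw per potential synapse; names and sizes are illustrative, not the paper's code.

```python
# Minimal sketch: a "genotype" specifies connection probabilities between
# discrete cell types; each developmental "draw" samples a binary
# connectivity matrix, so one genotype yields a population of phenotypically
# variable networks.
import numpy as np

rng = np.random.default_rng(0)

n_types, n_neurons = 4, 100
cell_type = rng.integers(0, n_types, size=n_neurons)      # type label per neuron

# Genotype: connection probability between every pair of cell types.
p_types = rng.uniform(0.0, 0.5, size=(n_types, n_types))


def develop_network(p_types, cell_type, rng):
    """Sample one phenotype: a binary adjacency matrix drawn from the
    type-level connection probabilities (one Bernoulli draw per synapse)."""
    p = p_types[cell_type][:, cell_type]                   # (N, N) per-synapse probs
    return (rng.random(p.shape) < p).astype(float)


# The same genotype produces a different individual on every draw, which is
# the non-heritable phenotypic variability studied in the paper.
population = [develop_network(p_types, cell_type, rng) for _ in range(10)]
print(population[0].shape, population[0].mean())
```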

Encoding innate ability through a genomic bottleneck

NeuroAI Reinforcement Learning

Sergey Shuvaev, Divyansha Lachi, Alexei Koulakov, and Anthony Zador

PNAS, 2024

Abstract: Animals are born with extensive innate behavioral capabilities, which arise from neural circuits encoded in the genome. However, the information capacity of the genome is orders of magnitude smaller than that needed to specify the connectivity of an arbitrary brain circuit, indicating that the rules encoding circuit formation must fit through a “genomic bottleneck” as they pass from one generation to the next. Here we formulate the problem of innate behavioral capacity in the context of artificial neural networks in terms of lossy compression of the weight matrix. We find that several standard network architectures can be compressed by several orders of magnitude, yielding pre-training performance that can approach that of the fully-trained network. Interestingly, for complex but not for simple test problems, the genomic bottleneck algorithm also captures essential features of the circuit, leading to enhanced transfer learning to novel tasks and datasets. Our results suggest that compressing a neural circuit through the genomic bottleneck serves as a regularizer, enabling evolution to select simple circuits that can be readily adapted to important real-world tasks. The genomic bottleneck also suggests how innate priors can complement conventional approaches to learning in designing algorithms for artificial intelligence.
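One way to picture the compression is a small "genomic" network that generates a large weight matrix from low-dimensional neuron identifiers. The hedged sketch below follows that idea as a hypernetwork over learned embeddings; all names and sizes are chosen for illustration rather than taken from the paper.

```python
# Hedged sketch of the compression idea, not the paper's implementation: a
# small "genomic" network generates the full weight matrix of a much larger
# layer, so far fewer parameters are passed between "generations" than the
# matrix they specify.
import torch
import torch.nn as nn


class GenomicLayer(nn.Module):
    def __init__(self, in_features, out_features, embed_dim=16, hidden=32):
        super().__init__()
        # "Genome": low-dimensional identifiers for pre/post neurons plus a
        # tiny network that maps an identifier pair to a synaptic weight.
        self.pre = nn.Embedding(in_features, embed_dim)
        self.post = nn.Embedding(out_features, embed_dim)
        self.g_net = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
        self.in_features, self.out_features = in_features, out_features

    def weight(self):
        pre = self.pre.weight.unsqueeze(0).expand(self.out_features, -1, -1)
        post = self.post.weight.unsqueeze(1).expand(-1, self.in_features, -1)
        pairs = torch.cat([pre, post], dim=-1)              # (out, in, 2*embed)
        return self.g_net(pairs).squeeze(-1)                # (out, in)

    def forward(self, x):
        return x @ self.weight().t()


layer = GenomicLayer(784, 256)
full = 784 * 256
genome = sum(p.numel() for p in layer.parameters())
print(f"compression ratio ~= {full / genome:.1f}x")   # weights specified vs. stored
```

Training the embeddings and the tiny generator end to end, rather than the weights directly, is what forces the circuit through the bottleneck and acts as the regularizer discussed in the abstract.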

Debunking Fake News by Leveraging Speaker Credibility and BERT Based Models

NLP

Thoudam Doren Singh, Divyansha, Apoorva Vikram Singh, Anubhav Sachan, and Abdullah Faiz Ur Rahman Khilji

IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2020

Abstract: The exponential growth of fake news and its role in eroding public trust and democratic standards calls for effective countermeasures. Predicting whether a news item is fake is a hard task, since most deceptive news has its roots in true news: with minor fabrication of legitimate news, influential fake news can be created for political, entertainment, or business-related gains. This work provides a novel, intuitive approach that exploits data from multiple sources to segregate news into real and fake. To efficiently capture the contextual information present in the data, Bidirectional Encoder Representations from Transformers (BERT) is deployed. The performance of the deceptive news detection model is further enhanced by incorporating information about the speaker's profile and the credibility associated with him or her. A hybrid sequence encoding model is proposed to harvest the speaker profile and credibility data and make them useful for prediction. On evaluation over the benchmark fake news dataset LIAR, our model outperforms previous state-of-the-art work, attesting that the speaker's profile and credibility play a crucial role in predicting the validity of news.
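For illustration, a minimal sketch of combining a BERT statement encoding with speaker-profile and credibility features is shown below. It uses the Hugging Face transformers API; the feature dimensions, class count, and classifier head are assumptions rather than the paper's hybrid sequence encoding model.

```python
# Illustrative sketch only (not the paper's model): a BERT statement encoding
# concatenated with speaker-profile/credibility features and passed through a
# small classifier head.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SpeakerAwareFakeNewsClassifier(nn.Module):
    def __init__(self, profile_dim=10, num_classes=6):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.head = nn.Sequential(
            nn.Linear(self.bert.config.hidden_size + profile_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),   # LIAR uses six truthfulness labels
        )

    def forward(self, input_ids, attention_mask, profile):
        # The [CLS] token embedding summarizes the statement's context.
        cls = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.head(torch.cat([cls, profile], dim=-1))


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["The economy grew by 10% last quarter."],
                  return_tensors="pt", truncation=True, padding=True)
profile = torch.randn(1, 10)   # e.g. encoded party, job, past credit-history counts
model = SpeakerAwareFakeNewsClassifier()
print(model(batch["input_ids"], batch["attention_mask"], profile).shape)
```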