Divyansha Lachi

Selected Publications

GraphFM: A Scalable Framework for Multi-Graph Pretraining

Graph Learning Transformers

Divyansha Lachi, Mehdi Azabou, Vinam Arora, and Eva Dyer

Preprint, 2024

arXiv

Abstract: Graph neural networks are typically trained on individual datasets, often requiring highly specialized models and extensive hyperparameter tuning. This dataset-specific approach arises because each graph dataset often has unique node features and diverse connectivity structures, making it difficult to build a generalist model. To address these challenges, we introduce a scalable multi-graph multi-task pretraining approach specifically tailored for node classification tasks across diverse graph datasets from different domains. Our method, Graph Foundation Model (GraphFM), leverages a Perceiver-based encoder that employs learned latent tokens to compress domain-specific features into a common latent space. This approach enhances the model's ability to generalize across different graphs and allows for scaling across diverse data. We demonstrate the efficacy of our approach by training a model on 152 different graph datasets comprising over 7.4 million nodes and 189 million edges, establishing the first set of scaling laws for multi-graph pretraining on datasets spanning many domains (e.g., molecules, citation and product graphs). Our results show that pretraining on a diverse array of real and synthetic graphs improves the model's adaptability and stability, while performing competitively with state-of-the-art specialist models. This work illustrates that multi-graph pretraining can significantly reduce the burden imposed by the current graph training paradigm, unlocking new capabilities for the field of graph neural networks by creating a single generalist model that performs competitively across a wide range of datasets and tasks.

Leveraging Perceiver IO and Relative Position Encodings for Enhanced Node Classification

Graph Learning Transformers

Divyansha Lachi, Vinam Arora, Mehdi Azabou, and Eva Dyer

SIAM Conference on Mathematics of Data Science (MDS24), Mini-symposium: New Frontier of Graph Machine Learning, 2024

Abstract: Graph transformer models have been limited in their application to large-scale graphs due to the quadratic computational complexity inherent to their attention mechanisms. This often results in a trade-off between scalability and model performance. In this work, we introduce a novel graph transformer architecture inspired by the PerceiverIO framework, which utilizes a combination of latent compression and relative position encoding to efficiently scale graph attention to larger graphs. Our approach mitigates the quadratic cost of pairwise communication between all nodes in a graph by learning a set of latent tokens through which nodes exchange messages. We evaluate our model on several node classification benchmarks and demonstrate that it not only surpasses existing graph transformer models in terms of performance but also maintains high efficiency and adaptability across different graph structures. Overall, our architecture offers a scalable solution for efficiently processing large graph datasets while significantly improving model performance and generalization.

Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability

NeuroAI Reinforcement Learning

Divyansha Lachi, Ann Huang, Augustine N Mavor-Parker, Arna Ghosh, Blake Richards, and Anthony Zador

Preprint, 2024

bioRxiv

Abstract: The development of neural connectivity is a crucial biological process that gives rise to diverse brain circuits and behaviors. Neural development is a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. We show that this model can be viewed as a generalization of a powerful class of artificial neural networks—Bayesian neural networks—where each network parameter is a sample from a distribution. Our results reveal that stochasticity confers a greater benefit in large networks and variable environments, which may explain its role in organisms with larger brains. Surprisingly, we find that the average fitness over a population of agents is higher than a single agent defined by the average connection probability. Our model reveals how developmental stochasticity, by inducing a form of non-heritable phenotypic variability, can increase the probability that at least some individuals will survive in rapidly changing, unpredictable environments. Our results suggest how stochasticity may be an important feature rather than a bug in neural development.

Encoding innate ability through a genomic bottleneck

NeuroAI Reinforcement Learning

Sergey Shuvaev, Divyansha Lachi, Alexei Koulakov, and Anthony Zador

PNAS, 2024

PNAS bioRxiv

Abstract: Animals are born with extensive innate behavioral capabilities, which arise from neural circuits encoded in the genome. However, the information capacity of the genome is orders of magnitude smaller than that needed to specify the connectivity of an arbitrary brain circuit, indicating that the rules encoding circuit formation must fit through a “genomic bottleneck” as they pass from one generation to the next. Here we formulate the problem of innate behavioral capacity in the context of artificial neural networks in terms of lossy compression of the weight matrix. We find that several standard network architectures can be compressed by several orders of magnitude, yielding pre-training performance that can approach that of the fully-trained network. Interestingly, for complex but not for simple test problems, the genomic bottleneck algorithm also captures essential features of the circuit, leading to enhanced transfer learning to novel tasks and datasets. Our results suggest that compressing a neural circuit through the genomic bottleneck serves as a regularizer, enabling evolution to select simple circuits that can be readily adapted to important real-world tasks. The genomic bottleneck also suggests how innate priors can complement conventional approaches to learning in designing algorithms for artificial intelligence.

Debunking Fake News by Leveraging Speaker Credibility and BERT Based Models

NLP

Thoudam Doren Singh, Divyansha, Apoorva Vikram Singh, Anubhav Sachan and Abdullah Faiz Ur Rahman Khilji

IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2020

Code

Abstract: The exponential growth in fake news and its role in deteriorating general public trust and democratic standards certainly calls for some counter combat approaches. The prediction of chances of news to be fake is deemed to be hard task since most of the deceptive news has its roots in true news. With a minor fabrication in legitimate news, influential fake news can be created that can be used for political, entertainment, or business-related gains. This work provides a novel intuitive approach to exploit data from multiple sources to segregate news into real and fake. To efficiently capture the contextual information present in the data, Bidirectional Encoder Representations from Transformer (BERT) have been deployed. It attempts to further enhance the performance of the deceptive news detection model by incorporating information about the speaker profile and the credibility associated with him/her. A hybrid sequence encoding model has been proposed to harvest the speaker profile and speaker credibility data which makes it useful for prediction. On evaluation over benchmark fake news dataset LIAR, our model outperformed the previous state-of-the-art works. This attests to the fact that the speaker’s profile and credibility play a crucial role in predicting the validity of news.

GraphFM: A Scalable Framework for Multi-Graph Pretraining

Graph Learning Transformers

Divyansha Lachi, Mehdi Azabou, Vinam Arora, and Eva Dyer

Preprint, 2024

arXiv

Leveraging Perceiver IO and Relative Position Encodings for Enhanced Node Classification

Graph Learning Transformers

Divyansha Lachi, Vinam Arora, Mehdi Azabou, and Eva Dyer

SIAM Conference on Mathematics of Data Science (MDS24), Mini-symposium: New Frontier of Graph Machine Learning, 2024

Stochastic wiring of cell types enhances fitness by generating phenotypic variability

NeuroAI Reinforcement Learning

Ann Huang, Divyansha Lachi, Augustine N Mavor-Parker, Arna Ghosh, Blake Richards, and Anthony Zador

From Neuroscience to Artificially Intelligent Systems (NAISys), 2024 (Oral)

Abstract: The development of neural connectivity is a critical biological process that underlies the formation of diverse brain circuits and behaviors. It is also a stochastic process, but this stochasticity is often treated as a nuisance to overcome rather than as a functional advantage. Here we use a computational model, in which connection probabilities between discrete cell types are genetically specified, to investigate the benefits of stochasticity in the development of neural wiring. Our model extends the concept of the deterministic genomic bottleneck (DGB) from Koulakov et al., (2021) to a stochastic genomic bottleneck (SGB) framework, incorporating variability in neural wiring. This SGB model can be interpreted as a generalization of Bayesian neural networks, where synaptic weights are sampled from probability distributions rather than fixed values. In supervised learning, SGB outperforms DGB at higher compression levels in zero-shot performance on both MNIST and CIFAR benchmarks and maintains remarkably stable performance across a wide range of compression levels. This suggests that stochastic developmental processes may be particularly beneficial for organisms with large brains, which face a greater challenge in genetically encoding their vast neural connectivity patterns. In reinforcement learning, we demonstrated that SGB endows agents with innate motor control capacities. Meanwhile, individual networks exhibit large phenotypic diversity across varied body morphologies and environmental terrains, with different individuals within a population being better adapted to different tasks, suggesting that developmental noise can lead to specialization in environmental niches. Furthermore, we provided analytical and empirical evidence that the mean performance of a population of networks sampled from the SGB framework exceeds that of the mean network, supporting the idea that stochasticity in neural development enhances overall population fitness. Our findings challenge the traditional view of developmental stochasticity as a bug, proposing instead that it is a crucial feature of biological systems. By leveraging stochasticity, neural development can produce a wide range of phenotypic outcomes, increasing the likelihood of individuals within a population surviving and thriving in dynamic environments.

Encoding innate ability through a genomic bottleneck

NeuroAI Reinforcement Learning

Sergey Shuvaev, Divyansha Lachi, Alexei Koulakov, and Anthony Zador

PNAS, 2024

PNAS bioRxiv

Stochastic Wiring of Cell Types Enhances Fitness by Generating Phenotypic Variability

NeuroAI Reinforcement Learning

Divyansha Lachi, Ann Huang, Augustine N Mavor-Parker, Arna Ghosh, Blake Richards, and Anthony Zador

Preprint, 2024

bioRxiv

Genomic bottleneck approach to faster learning

NeuroAI Reinforcement Learning

Alexei Koulakov, Divyansha Lachi, Sergey Shuvaev, and Anthony Zador

From Neuroscience to Artificially Intelligent Systems (NAISys), 2022 (Oral)

Abstract: The nervous system exhibits a remarkable ability to solve diverse problems with minimal training or experience. Some behaviors are completely innate, requiring no experience whatsoever. Animals manage to function so well with little or no experience soon after birth because they rely heavily on innate mechanisms, encoded in the genome. The genome—shaped by hundreds of millions of years of evolution—encodes blueprints for wiring up their nervous system. However, the information capacity of the genome is orders of magnitude smaller than that needed to specify the connectivity of an arbitrary brain circuit, indicating that the rules encoding circuit formation must fit through a "genomic bottleneck" as they pass from one generation to the next. Here, we formulate the problem of innate behavioral capacity in the context of artificial neural networks in terms of lossy compression of the weight matrix. These compression algorithms are inspired by the neurodevelopmental mechanisms based on chemoaffinity principle known to implement initial brain wiring. We find that several standard network architectures can be compressed by several orders of magnitude, yielding pre-training performance that can approach that of the fully-trained network. Interestingly, for complex but not for simple test problems, the genomic bottleneck algorithm also captures essential features of the circuit, leading to enhanced transfer learning to novel data sets. We observe an enhanced transfer learning both for standard supervised and reinforcement learning tasks. Our results suggest that compressing a neural circuit through the genomic bottleneck serves as a regularizer, enabling evolution to discover simple circuits that can be readily adapted to important real-world tasks. The genomic bottleneck also suggests how innate priors can complement conventional approaches to learning in designing algorithms for artificial intelligence.

Debunking Fake News by Leveraging Speaker Credibility and BERT Based Models

NLP

Thoudam Doren Singh, Divyansha, Apoorva Vikram Singh, Anubhav Sachan and Abdullah Faiz Ur Rahman Khilji

IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2020

Code

A Hybrid Classification Approach using Topic Modeling and Graph Convolution Networks

Graph Learning NLP

Thoudam Doren Singh, Divyansha, Apoorva Vikram Singh, and Abdullah Faiz Ur Rahman Khilji

ComPE2020, IEEE

IEEE Explore Code

Abstract: Text classification has become a key operation in various natural language processing tasks. The efficiency of most classification algorithms predominantly confide in the quality of input features. In this work, we propose a novel multi-class text classification technique that harvests features from two distinct feature extraction methods. Firstly, a structured heterogeneous text graph built based on document-word relations and word co-occurrences is leveraged using a Graph Convolution Network (GCN). Secondly, the documents are topic modeled to use the document-topic score as features into the classification model. The concerned graph is constructed using Point-Wise Mutual Information (PMI) between pair of word co-occurrences and Term Frequency-Inverse Document Frequency (TF-IDF) score for words in the documents for word co-occurrences. Experimentation reveals that our text classification model outperforms the existing techniques for five benchmark text classification data sets.

Predictive Approaches for the UNIX Command Line: Curating and Exploiting Domain Knowledge in Semantics-Deficit Data

NLP

Thoudam Doren Singh, Abdullah Faiz Ur Rahman Khilji, Divyansha, Apoorva Vikram Singh, Surmila Thokchom, and Sivaji Bandyopadhyay

Multimedia Tools and Applications, Springer, 2020

Springer Code

Abstract: The command line has always been the most efficient method to interact with UNIX flavor based systems while offering a great deal of flexibility and efficiency as preferred by professionals. Such a system is based on manually inputting commands to instruct the computing machine to carry out tasks as desired. This human-computer interface is quite tedious especially for a beginner. And hence, the command line has not been able to garner an overwhelming reception from new users. Therefore, to improve user-friendliness and to mark a step towards a more intuitive command line system, we propose two predictive approaches that can benefit all kinds of users specially the novice ones by integrating into the command line interface. These methods are based on deep learning based predictions. The first approach is based on the sequence to sequence (Seq2seq) model with joint learning by leveraging continuous representations of a self-curated exhaustive knowledge base (KB) comprising an all-inclusive command description to enhance the embedding employed in the model. The other is based on the attention-based transformer architecture where a pretrained model is employed. This allows the model to dynamically evolve over time making it adaptable to different circumstances by learning as the system is being used. To reinforce our idea, we have experimented with our models on three major publicly available Unix command line datasets and have achieved benchmark results using GLoVe and Word2Vec embeddings. Our finding is that the transformer based framework performs better on two different datasets of the three in our experiment in a semantic deficit scenario like UNIX command line prediction. However, Seq2seq based model outperforms bidirectional encoder representations from transformers (BERT) based model on a larger dataset.

Seq2Seq and Joint Learning Based Unix Command Line Prediction System

NLP

Thoudam Doren Singh, Abdullah Faiz Ur Rahman Khilji, Divyansha, Apoorva Vikram Singh, Surmila Thokchom, and Sivaji Bandyopadhyay

Preprint, 2020

arXiv Code

Despite being an open-source operating system pioneered in the early 90s, UNIX based platforms have not been able to garner an overwhelming reception from amateur end users. One of the rationales for under popularity of UNIX based systems is the steep learning curve corresponding to them due to extensive use of command line interface instead of usual interactive graphical user interface. In past years, the majority of insights used to explore the concern are eminently centered around the notion of utilizing chronic log history of the user to make the prediction of successive command. The approaches directed at anatomization of this notion are predominantly in accordance with Probabilistic inference models. The techniques employed in past, however, have not been competent enough to address the predicament as legitimately as anticipated. Instead of deploying usual mechanism of recommendation systems, we have employed a simple yet novel approach of Seq2seq model by leveraging continuous representations of self-curated exhaustive Knowledge Base (KB) to enhance the embedding employed in the model. This work describes an assistive, adaptive and dynamic way of enhancing UNIX command line prediction systems. Experimental methods state that our model has achieved accuracy surpassing mixture of other techniques and adaptive command line interface mechanism as acclaimed in the past.