Machine learning is a subset of artificial intelligence that can bring value to a business through efficiency gains and predictive insight, making it a valuable tool for any organization.
Last year was full of machine learning breakthroughs, and this year is no different. There is just so much to learn about.
With so much out there, I have selected a few papers from 2024 that you should read to improve your knowledge.
What are these papers? Let’s get into it.
HyperFast: Instant Classification for Tabular Data
HyperFast is a meta-trained hypernetwork model developed by Bonet et al. (2024). It's designed to classify tabular data instantly, in a single forward pass.
The authors state that HyperFast can generate a task-specific neural network for an unseen dataset that can be used directly for classification, eliminating the need to train a model. This approach significantly reduces the computational demands and time required to deploy machine learning models.
In the HyperFast framework, the input data is transformed through standardization and dimensionality reduction, then passed through a sequence of hypernetworks that produce the weights for the main network's layers, including a nearest-neighbor-based classification bias.
Overall, the results show that HyperFast performs excellently: it is faster than many classical methods and needs no fine-tuning. The paper concludes that HyperFast could become a new approach that can be applied in many real-life cases.
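To make the idea concrete, here is a minimal toy sketch of a hypernetwork that maps a labeled support set to the weights of a linear classifier in a single forward pass. This is purely illustrative and not the authors' implementation: the class name, dimensions, and mean-pooled task summary are my own simplifications, and a real meta-trained model would come from the paper's released code.

```python
# Toy hypernetwork: a meta-network emits the weights of a task-specific
# linear classifier from a labeled support set -- one forward pass, no training.
import torch
import torch.nn as nn

class TinyHyperNetwork(nn.Module):  # hypothetical, not HyperFast itself
    def __init__(self, n_features: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.n_features, self.n_classes = n_features, n_classes
        # Encode each (x, one-hot y) pair, then emit classifier weights + bias.
        self.encoder = nn.Sequential(
            nn.Linear(n_features + n_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, n_features * n_classes + n_classes),
        )

    def forward(self, X_support, y_support):
        y_onehot = nn.functional.one_hot(y_support, self.n_classes).float()
        # Mean-pool per-example encodings into a single task summary vector.
        summary = self.encoder(torch.cat([X_support, y_onehot], dim=1)).mean(0)
        W = summary[: self.n_features * self.n_classes]
        b = summary[self.n_features * self.n_classes :]
        return W.view(self.n_classes, self.n_features), b

hyper = TinyHyperNetwork(n_features=10, n_classes=3)
X_sup, y_sup = torch.randn(32, 10), torch.randint(0, 3, (32,))
W, b = hyper(X_sup, y_sup)              # generate a classifier instantly
logits = torch.randn(5, 10) @ W.T + b   # and use it on new rows right away
```

Once such a hypernetwork is meta-trained across many datasets, a new dataset yields a ready-to-use classifier without any gradient-based training loop, which is the property HyperFast exploits.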
EasyRL4Rec: A User-Friendly Code Library for Reinforcement Learning Based Recommender Systems
The next paper we will discuss is by Yu et al. (2024), who propose EasyRL4Rec, a user-friendly code library for developing and testing Reinforcement Learning (RL)-based Recommender Systems (RSs).
The library offers a modular structure with four core modules (Environment, Policy, StateTracker, and Collector), each addressing different stages of the Reinforcement Learning process.
The overall structure is organized around these core modules for the RL workflow: Environments (Envs) simulate user interactions, the Collector gathers data from those interactions, the StateTracker creates state representations, and the Policy module handles decision-making. The library also includes a data layer for managing datasets and an Executor layer, with a Trainer and an Evaluator, for overseeing the learning and performance assessment of the RL agent, as sketched below.
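To illustrate how these modules fit together, below is a hypothetical, self-contained sketch of the Environment/Collector/StateTracker/Policy loop driven by a trainer-style outer loop. None of this is EasyRL4Rec's actual API; every class and method name is invented, so check the library's documentation for the real interfaces.

```python
# Hypothetical sketch of the four-module RL-for-recommendation loop.
# All names are invented for illustration; this is not EasyRL4Rec's API.
import random

class Env:                          # simulates a user's reaction to an item
    def reset(self): return {"history": []}
    def step(self, item):           # -> (observation, reward, done)
        reward = random.random()    # stand-in for simulated user feedback
        return {"history": [item]}, reward, reward < 0.1

class StateTracker:                 # turns raw interactions into a state
    def encode(self, obs): return [len(obs["history"])]

class Policy:                       # chooses the next item to recommend
    def act(self, state): return random.randrange(100)
    def update(self, batch): pass   # the learning step would live here

class Collector:                    # gathers (state, action, reward) data
    def __init__(self, env, tracker, policy):
        self.env, self.tracker, self.policy = env, tracker, policy
    def rollout(self, steps=10):
        obs, batch = self.env.reset(), []
        for _ in range(steps):
            state = self.tracker.encode(obs)
            action = self.policy.act(state)
            obs, reward, done = self.env.step(action)
            batch.append((state, action, reward))
            if done: break
        return batch

# Trainer/Evaluator layer: alternate data collection and policy updates.
collector = Collector(Env(), StateTracker(), Policy())
for epoch in range(3):
    collector.policy.update(collector.rollout())
```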
The authors conclude that EasyRL4Rec offers a user-friendly framework that could address practical challenges in applying RL to recommender systems.
Label Propagation for Zero-shot Classification with Vision-Language Models
The paper by Stojnic et al. (2024) introduces a technique called ZLaP, which stands for Zero-shot classification with Label Propagation. It enhances zero-shot classification with vision-language models by using geodesic distances on a graph to classify.
As we know, vision-language models such as GPT-4V or LLaVA are capable of zero-shot learning, performing classification without labeled images. However, their performance can still be improved, which is why the research group developed the ZLaP technique.
ZLaP's core idea is to apply label propagation on a graph-structured dataset comprising both image and text nodes, calculating geodesic distances within this graph to perform classification. The method is also designed to handle the dual modalities of text and images.
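For intuition, here is a minimal sketch of classic label propagation on a tiny joint image-text graph, in the spirit of Zhou et al.'s local-and-global-consistency formulation. It is not ZLaP's exact algorithm: the paper's geodesic-distance construction, graph sparsification, and inductive variant are omitted, and the similarity matrix below is made up.

```python
# Generic label propagation on a toy graph: nodes 0-1 are labeled text
# (class) nodes, nodes 2-4 are unlabeled image nodes. Illustrative only.
import numpy as np

A = np.array([                 # made-up pairwise similarities (edge weights)
    [0.0, 0.0, 0.9, 0.1, 0.2],
    [0.0, 0.0, 0.1, 0.8, 0.7],
    [0.9, 0.1, 0.0, 0.3, 0.1],
    [0.1, 0.8, 0.3, 0.0, 0.4],
    [0.2, 0.7, 0.1, 0.4, 0.0],
])
S = A / A.sum(axis=1, keepdims=True)   # row-normalized transition matrix

Y = np.zeros((5, 2))
Y[0, 0] = Y[1, 1] = 1.0                # one-hot labels on the text nodes

F, alpha = Y.copy(), 0.9
for _ in range(50):                    # iterate F <- alpha*S@F + (1-alpha)*Y
    F = alpha * S @ F + (1 - alpha) * Y

print(F[2:].argmax(axis=1))            # predicted classes for image nodes
```

In ZLaP, a similar diffusion over an image-text graph lets unlabeled images inherit class information from text (and image) neighbors without any labeled training images.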
Performance-wise, ZLaP consistently outperformed other state-of-the-art zero-shot methods across experiments on 14 datasets, using both transductive and inductive inference.
Overall, the technique significantly improved classification accuracy across multiple datasets, which shows promise for ZLaP with vision-language models.
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
The fourth paper we will discuss is by Munkhdalai et al. (2024). Their paper introduces Infini-attention, a method for scaling Transformer-based Large Language Models (LLMs) to handle infinitely long inputs with bounded compute and memory.
The Infini-attention mechanism integrates a compressive memory into the traditional attention framework. By combining standard causal attention with a compressive memory that stores and updates historical context, the model can efficiently process extended sequences, aggregating both long-term and local information within a single Transformer block.
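A simplified sketch of the segment-level recurrence helps make this concrete: local causal attention handles the current segment, while a fixed-size associative memory matrix is queried for long-range context and then updated with the segment's keys and values. This is a stripped-down reading of the paper's equations; the delta-rule memory update, learned gating, and multi-head details are omitted, and the gating scalar here is a fixed stand-in for the learned one.

```python
# Simplified Infini-attention-style segment processing (single head).
import torch
import torch.nn.functional as F

d_k, d_v = 64, 64
M = torch.zeros(d_k, d_v)   # compressive memory (associative matrix)
z = torch.zeros(d_k)        # normalization term

def sigma(x):               # ELU + 1 keeps the kernel feature map positive
    return F.elu(x) + 1.0

def process_segment(Q, K, V, M, z):
    # 1) Retrieve long-term context from memory with the current queries.
    A_mem = (sigma(Q) @ M) / (sigma(Q) @ z).unsqueeze(-1).clamp(min=1e-6)
    # 2) Standard local causal attention within the segment.
    n = Q.size(0)
    mask = torch.tril(torch.ones(n, n, dtype=torch.bool))
    scores = (Q @ K.T / d_k**0.5).masked_fill(~mask, float("-inf"))
    A_local = scores.softmax(dim=-1) @ V
    # 3) Update the memory with this segment, then blend both contexts.
    M = M + sigma(K).T @ V
    z = z + sigma(K).sum(dim=0)
    beta = 0.5              # fixed stand-in for the paper's learned gate
    return beta * A_mem + (1 - beta) * A_local, M, z

# Stream segments of a long sequence through the same fixed-size memory.
for _ in range(4):
    Q, K, V = torch.randn(128, d_k), torch.randn(128, d_k), torch.randn(128, d_v)
    out, M, z = process_segment(Q, K, V, M, z)
```

Because M and z have fixed shapes, memory cost stays constant no matter how many segments stream through, which is what allows arbitrarily long inputs.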
Overall, the technique outperforms currently available models on tasks involving long-context language modeling, such as passkey retrieval from long sequences and book summarization.
The technique could enable many future applications, especially those that require processing extensive text data.
AutoCodeRover: Autonomous Program Improvement
The last paper we will discuss is by Zhang et al. (2024). It focuses on AutoCodeRover, a tool that uses Large Language Models (LLMs) to perform sophisticated code searches and automate the resolution of GitHub issues, mainly bug reports and feature requests. By using LLMs to parse and understand issues from GitHub, AutoCodeRover can navigate and manipulate the code structure more effectively than traditional file-based approaches.
AutoCodeRover works in two main stages: a context retrieval stage and a patch generation stage. It iteratively searches the codebase and analyzes the results to check whether enough information has been gathered to identify the buggy parts of the code, then attempts to generate a patch that fixes the issue.
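The flow can be pictured with a short, hypothetical sketch. This is not AutoCodeRover's actual code: the toy repository, function names, and the stubbed LLM calls below are all invented purely to show the retrieve-then-patch loop.

```python
# Invented sketch of an AutoCodeRover-style two-stage loop. The "LLM" calls
# are stubs; the real system queries a model and structure-aware code search.
CODEBASE = {"app/math.py": "def divide(a, b):\n    return a / b\n"}

def search_code(query: str) -> list:
    # Stand-in for structure-aware search over classes/methods in the repo.
    return [path for path, src in CODEBASE.items() if query in src]

def llm_needs_more_context(gathered: list) -> bool:
    return len(gathered) == 0          # stub: stop once anything was found

def llm_generate_patch(files: list) -> str:
    return "guard against b == 0 in divide()"   # stub patch suggestion

def resolve_issue(issue: str) -> str:
    gathered, queries = [], issue.split()
    while True:                        # Stage 1: iterative context retrieval
        for q in queries:
            gathered += search_code(q)
        if not llm_needs_more_context(gathered):
            break
        queries = ["divide"]           # stub: an LLM would propose new queries
    return llm_generate_patch(gathered)   # Stage 2: patch generation

print(resolve_issue("ZeroDivisionError in divide"))
```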
The paper shows that AutoCodeRover improves on previous methods. For example, it resolved 22-23% of issues (67 in total) from the SWE-bench lite dataset, taking less than 12 minutes per issue on average, whereas resolving an issue manually can take around two days.
Overall, the paper shows promise as AutoCodeRover is capable of significantly reducing the manual effort required in program maintenance and improvement tasks.
Conclusion
There are many machine learning papers to read in 2024, and here are the ones I recommend:
- HyperFast: Instant Classification for Tabular Data
- EasyRL4Rec: A User-Friendly Code Library for Reinforcement Learning Based Recommender Systems
- Label Propagation for Zero-shot Classification with Vision-Language Models
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
- AutoCodeRover: Autonomous Program Improvement
I hope it helps!
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and written media. Cornellius writes on a variety of AI and machine learning topics.