# Selected Papers on Dense Retrieval
## Introduction about Dense Retrieval
- Lecture Notes on Neural Information Retrieval.
- Pretrained Transformers for Text Ranking: BERT and Beyond.
- Semantic Models for the First-stage Retrieval: A Comprehensive Review.
- Pre-training Methods in Information Retrieval.
- Sparse, Dense, and Attentional Representations for Text Retrieval.
- Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey.

## Universal Sentence Embedding Learning
- Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models.
- SimCSE: Simple Contrastive Learning of Sentence Embeddings.
- Text and Code Embeddings by Contrastive Pre-Training.

## Dense Retrieval Improvement on Query Side
- Query Embedding Pruning for Dense Retrieval.
- Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection.
- Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback.
- Pseudo-Relevance Feedback for Multiple Representation Dense Retrieval.
- Dealing with Typos for BERT-based Passage Retrieval and Ranking.
- Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings.

## Dense Retrieval Improvement on Document Side
- Dense Hierarchical Retrieval for Open-Domain Question Answering.
- Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval.
- Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval.
- UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering.
- Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation.
- Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models.
- Open Domain Question Answering over Tables via Dense Retrieval.
- PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval.
- Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval.
- LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval.
- Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval.
- Multi-View Document Representation Learning for Open-Domain Dense Retrieval.
- Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation.
- Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation.
- BERT Rankers are Brittle: a Study using Adversarial Document Perturbations.
- Autoregressive Search Engines: Generating Substrings as Document Identifiers.

## Improve Ranking Performance
- ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction.
- Condenser: a Pre-training Architecture for Dense Retrieval.
- Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?
- BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval.

## Few-Shot Learning about Dense Retrieval
- Promptagator: Few-shot Dense Retrieval From 8 Examples.

## Dataset
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models.
- mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset.
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training.
- Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval.

## Considering Click Behaviors
- Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search.