# Selected Papers on Dense Retrieval
## Introduction about Dense Retrieval
- Lecture Notes on Neural Information Retrieval.
- Pretrained Transformers for Text Ranking: BERT and Beyond.
- Semantic Models for the First-stage Retrieval: A Comprehensive Review.
- Pre-training Methods in Information Retrieval.
- Sparse, Dense, and Attentional Representations for Text Retrieval.
- Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey.

## Universal Sentence Embedding Learning
- Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models.
- SimCSE: Simple Contrastive Learning of Sentence Embeddings.
- Text and Code Embeddings by Contrastive Pre-Training.

## Dense Retrieval Improvement on Query Side
- Query Embedding Pruning for Dense Retrieval.
- Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection.
- Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback.
- Pseudo-Relevance Feedback for Multiple Representation Dense Retrieval.
- Dealing with Typos for BERT-based Passage Retrieval and Ranking.
- Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings.

## Dense Retrieval Improvement on Document Side
- Dense Hierarchical Retrieval for Open-Domain Question Answering.
- Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval.
- Sentence-aware Contrastive Learning for Open-Domain Passage Retrieval.
- UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering.
- Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation.
- Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models.
- Open Domain Question Answering over Tables via Dense Retrieval.
- PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval.
- Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval.
- LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval.
- Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval.
- Multi-View Document Representation Learning for Open-Domain Dense Retrieval.
- Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation.
- Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation.
- BERT Rankers are Brittle: a Study using Adversarial Document Perturbations.
- Autoregressive Search Engines: Generating Substrings as Document Identifiers.

## Improve Ranking Performance
- ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction.
- Condenser: a Pre-training Architecture for Dense Retrieval.
- Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?
- BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval.

## Few-Shot Learning about Dense Retrieval
- Promptagator: Few-shot Dense Retrieval From 8 Examples.

## Dataset
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models.
- mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset.
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training.
- Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval.

## Considering Click Behaviors
- Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search.