# Semantic Search with BERT

In this tutorial, we will implement semantic search on a sample dataset. For vectorization, we will use DistilBERT from Hugging Face, a lighter and faster version of BERT that retains most of its accuracy. We will use Upstash Vector to store and query the vectors.

Here is the outline:

1- Create an index on Upstash Vector and install the required dependencies.

2- Download a sample dataset of newsgroup documents, available at [http://qwone.com/~jason/20Newsgroups/](http://qwone.com/~jason/20Newsgroups/).

3- Vectorize the documents using DistilBERT.

4- Insert the vectors into the database.

5- Conduct a test query.
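
The steps above can be sketched roughly as follows. This is a minimal sketch under several assumptions, not the notebook's exact code: the `distilbert-base-uncased` checkpoint (768-dimensional output, so the index would need dimension 768), mean pooling over token embeddings, scikit-learn's `fetch_20newsgroups` loader as the download step, and credentials read from the `UPSTASH_VECTOR_REST_URL` and `UPSTASH_VECTOR_REST_TOKEN` environment variables. The helper names (`load_model`, `embed`, `to_records`, `batched`, `main`) are hypothetical.

```python
import os
from functools import lru_cache
from typing import Iterator, List, Sequence, Tuple

MODEL_NAME = "distilbert-base-uncased"  # assumption: checkpoint with 768-dim hidden states


@lru_cache(maxsize=1)
def load_model():
    """Load the tokenizer and model once; heavy imports stay inside the function."""
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()
    return tokenizer, model


def embed(texts: Sequence[str]) -> List[List[float]]:
    """Return one vector per text by mean-pooling token embeddings (an assumed pooling choice)."""
    import torch

    tokenizer, model = load_model()
    inputs = tokenizer(list(texts), padding=True, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, tokens, 768)
    # Zero out padding positions before averaging over the token axis.
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def to_records(texts: Sequence[str], vectors: Sequence[List[float]],
               start: int = 0) -> List[Tuple[str, List[float], dict]]:
    """Build (id, vector, metadata) tuples; ids continue from `start`."""
    return [(f"doc-{start + i}", vec, {"text": text[:200]})
            for i, (text, vec) in enumerate(zip(texts, vectors))]


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for begin in range(0, len(items), size):
        yield items[begin:begin + size]


def main() -> None:
    from sklearn.datasets import fetch_20newsgroups
    from upstash_vector import Index

    index = Index(url=os.environ["UPSTASH_VECTOR_REST_URL"],
                  token=os.environ["UPSTASH_VECTOR_REST_TOKEN"])

    # A small slice keeps the sketch quick; the full dataset is much larger.
    docs = fetch_20newsgroups(subset="train").data[:100]

    offset = 0
    for chunk in batched(docs, 16):
        index.upsert(vectors=to_records(chunk, embed(chunk), start=offset))
        offset += len(chunk)

    # Test query: embed the query text the same way as the documents.
    query_vector = embed(["space shuttle launch"])[0]
    for hit in index.query(vector=query_vector, top_k=3, include_metadata=True):
        print(hit.id, hit.score, hit.metadata["text"][:80])


# Run end to end only when credentials are configured.
if __name__ == "__main__" and "UPSTASH_VECTOR_REST_URL" in os.environ:
    main()
```

Embedding the query with the same model and pooling as the documents matters, since similarity scores are only meaningful when both sides live in the same vector space. Storing a short text snippet as metadata makes the query results readable without a second lookup.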

You can find the full tutorial and code in the [notebook here](https://colab.research.google.com/drive/1xXSLtG2uItaBWAxXYgzwa5jo669mDk2p?authuser=1).
