# Conversational Search

In this notebook you will implement the following steps:

- **Answer selection + evaluation**: Implement a *search-based* conversation framework and evaluate it on conversation topics made up of conversation turns.
- **Answer ranking**: Implement a *re-ranking method* to sort the initial search results. Evaluate the re-ranked results.
- **Conversation context**: Implement a conversational context modeling method to keep track of the conversation state. 

Submission dates:
- **20 October**: retrieval + evaluation
- **20 November**: passage re-ranking
- **20 December**: conversation state tracking

## Test bed and conversation topics
The TREC CAsT corpus (http://www.treccast.ai/) for Conversational Search is indexed in this cluster and can be searched through an ElasticSearch API.

The queries and the relevance judgments are available through class `ConvSearchEvaluation`:

# Google Colab Setup

The following steps are already implemented in the cell below. You need to download the starting project folder, upload it, adjust the paths, and finally run the notebook.


1.   Download the shared project folder as a zip;
2.   Unzip and re-upload to a folder of your own GDrive;
3.   Mount your GDrive on the Colab working environment;

Note: You will be asked to complete a Google Authorization procedure by following a link and pasting a code on the notebook.

4.   Copy the contents from the folder you uploaded to the Colab working dir;
5.   Add sys path locations to run aux Python scripts;
6.   Install dependencies.

After going through all these steps you should be able to run all the cells in the notebook.

In [1]:
# Colab Setup
# Mount your Google Drive
from google.colab import drive
drive.mount('/content/drive')

# After downloading the shared starting point folder as a Zip
# Unzip it and re-upload it to a location on your GDrive

# This command copies the contents from the folder you uploaded to GDrive, to the colab working dir
!cp -r /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/ProjectoRI2020 /content

# Add working dir to the sys path, so that we can find the aux python files when running the Notebook
import sys
if '/content/ProjectoRI2020' not in sys.path:
  sys.path.append('/content/ProjectoRI2020')

# Finally install required dependencies to run the notebook
!pip install elasticsearch
!pip install bert-serving-client

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
PROJ_DIR = "/content/drive/My Drive/faculdade/fct-miei/04_ano4_(year4)/semestre1/ri/infos_projeto"

RUN_PHASE = 3
UPDATE_ELASTICSEARCH_RESULTS = False
UPDATE_BERT_RESULTS = False
REL_DOCS_PER_TURN = 10
IDX = {
    'TRAIN': (1, 2, 4, 7, 15, 17,18,22,23,24,25,27,30),
    'TEST': (31, 32, 33, 34, 37, 40, 49, 50, 54, 56, 58, 59, 61, 67, 68, 69, 75, 77, 78, 79)
}
SET_NAME = {
    'TRAIN': "train",
    'TEST': "test"
}

if RUN_PHASE == 1:
  if REL_DOCS_PER_TURN == 10:
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/train/10/*
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/test/10/*
  if REL_DOCS_PER_TURN == 100:
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/train/100/*
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/test/100/*
  if REL_DOCS_PER_TURN == 1000:
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/train/1000/*
    !rm -rf /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/plots/test/1000/*

In [3]:
import TRECCASTeval as trec
import numpy as np

from sklearn.metrics.pairwise import cosine_similarity

import ElasticSearchSimpleAPI as es

import pprint

test_bed = trec.ConvSearchEvaluation()

print()
print("========================================== Training conversations =====")
topics = {}
for topic in test_bed.train_topics:
    conv_id = topic['number']

    if conv_id not in IDX['TRAIN']:  # conversation IDs selected in the configuration cell
        continue

    print()
    print(conv_id, "  ", topic['title'])

    for turn in topic['turn']:
        turn_id = turn['number']
        utterance = turn['raw_utterance']
        topic_turn_id = '%d_%d'% (conv_id, turn_id)
        
        print(topic_turn_id, utterance)
        topics[topic_turn_id] = utterance

print()
print("========================================== Test conversations =====")
for topic in test_bed.test_topics:
    conv_id = topic['number']

    if conv_id not in IDX['TEST']:  # conversation IDs selected in the configuration cell
        continue

    print()
    print(conv_id, "  ", topic['title'])

    for turn in topic['turn']:
        turn_id = turn['number']
        utterance = turn['raw_utterance']
        topic_turn_id = '%d_%d'% (conv_id, turn_id)
        
        print(topic_turn_id, utterance)
        topics[topic_turn_id] = utterance




1    Career choice for Nursing and Physician's Assistant
1_1 What is a physician's assistant?
1_2 What are the educational requirements required to become one?
1_3 What does it cost?
1_4 What's the average starting salary in the UK?
1_5 What about in the US?
1_6 What school subjects are needed to become a registered nurse?
1_7 What is the PA average salary vs an RN?
1_8 What the difference between a PA and a nurse practitioner?
1_9 Do NPs or PAs make more?
1_10 Is a PA above a NP?
1_11 What is the fastest way to become a NP?
1_12 How much longer does it take to become a doctor after being an NP?

2    Goat breeds
2_1 What are the main breeds of goat?
2_2 Tell me about boer goats.
2_3 What breed is good for meat?
2_4 Are angora goats good for it?
2_5 What about boer goats?
2_6 What are pygmies used for?
2_7 What is the best for fiber production?
2_8 How long do Angora goats live?
2_9 Can you milk them?
2_10 How many can you have per acre?
2_11 Are they profitable?

4    The Neolithic 

Search example:

In [4]:
#elastic = es.ESSimpleAPI()
#results = elastic.search_body(topics['33_1'], numDocs = 10)
#results = elastic.get_doc_body(topics['33_1']['_id'])
#print(results)

## Retrieval with the training conversations


The ElasticSearchSimpleAPI notebook illustrates how to use ElasticSearch. Use this API to retrieve the top 100 ranked passages for each conversation turn. 

To evaluate the results you should use the provided `ConvSearchEvaluation` class. Examine and discuss the recall results. Also discuss which metrics you should aim to improve at each step of the project.
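
To make the metrics discussed above concrete, here is a minimal sketch of how P@10, recall, AP, and NDCG@5 can be computed for one ranked list. It is an illustration only: the function names, the toy `ranked` list, and the `grades` dictionary are assumptions, and the exact formulas inside `ConvSearchEvaluation.eval` may differ (for example in the gain function used for NDCG).

```python
import math

def precision_at_k(ranked_ids, relevant, k=10):
    """Fraction of the top-k positions filled by relevant documents."""
    return sum(1 for d in ranked_ids[:k] if d in relevant) / k

def recall(ranked_ids, relevant):
    """Fraction of all relevant documents that appear in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked_ids) & relevant) / len(relevant)

def average_precision(ranked_ids, relevant):
    """Mean of the precision values at each rank where a relevant document is found."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked_ids, grades, k=5):
    """DCG of the top-k results divided by the DCG of an ideal ordering (linear gain)."""
    dcg = sum(grades.get(d, 0) / math.log2(i + 2) for i, d in enumerate(ranked_ids[:k]))
    ideal = sorted(grades.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Toy data (hypothetical): graded judgments and a ranked list for one turn
grades = {"d1": 2, "d3": 1, "d9": 1}
relevant = {d for d, g in grades.items() if g > 0}
ranked = ["d1", "d2", "d3", "d4"]

print("P@10   =", precision_at_k(ranked, relevant))
print("Recall =", recall(ranked, relevant))
print("AP     =", average_precision(ranked, relevant))
print("NDCG@5 =", ndcg_at_k(ranked, grades))
```

Recall bounds what any later re-ranking step can achieve, because a re-ranker only reorders the retrieved passages, while P@10, AP, and NDCG@5 depend on how those passages are ordered.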

In [5]:
"""import TRECCASTeval as trec
import numpy as np

import ElasticSearchSimpleAPI as es
import numpy as np

import pprint as pprint
import plots as plots

elastic = es.ESSimpleAPI()
test_bed = trec.ConvSearchEvaluation()

# total
_recall = 0
_p10 = 0
_ndcg5 = 0

# counters
_ntopics = 0
_nturns = 0
_ntotalTurns = 0

# metrics
_p10s = np.array([])
_aps = np.array([])
_ndcg5s = np.array([])

# conv and turns numbers and names
_convNumbers = []
_convNames = np.array([])

for topic in test_bed.train_topics:
    conv_id = topic['number']
    if conv_id not in (1, 2, 4, 7, 15, 17,18,22,23,24,25,27,30):
        continue
    _convNumbers.append(conv_id)
    _convNames = np.append(_convNames, str(conv_id) + " " + topic['title'])
        
    for turn in topic['turn'][:8]:
        turn_id = turn['number']
        utterance = turn['raw_utterance']
        topic_turn_id = '%d_%d'% (conv_id, turn_id)
        
        aux = test_bed.relevance_judgments.loc[test_bed.relevance_judgments['topic_turn_id'] == (topic_turn_id)]
        num_rel = aux.loc[aux['rel'] != 0]['docid'].count()

        _convNames = np.append(_convNames, topic_turn_id + " " + utterance)
        
        if num_rel == 0:
            _p10s = np.append(_p10s, np.nan)
            _aps = np.append(_aps, np.nan)
            _ndcg5s = np.append(_ndcg5s, np.nan)
            _ntotalTurns += 1
            continue

        result = elastic.search_body(query=utterance, numDocs = 100)

        if np.size(result) == 0 or num_rel == 0:
            _p10s = np.append(_p10s, 0.0)
            _aps = np.append(_aps, 0.0)
            _ndcg5s = np.append(_ndcg5s, 0.0)
            _ntotalTurns += 1
            print(topic_turn_id, utterance, num_rel, "NO RESULTS")
            continue
        else:
            print(topic_turn_id, utterance, num_rel)

        [p10, recall, ap, ndcg5] = test_bed.eval(result[['_id','_score']], topic_turn_id)

        print('P@10=', p10, '  Recall=', recall, ' AP=', ap, '  NDCG@5=',ndcg5)
        # total
        _recall = _recall + recall
        _p10 = _p10 + p10
        _ndcg5 = _ndcg5 + ndcg5

        # metrics
        _p10s = np.append(_p10s, p10)
        _aps = np.append(_aps, ap)
        _ndcg5s = np.append(_ndcg5s, ndcg5)
        
        # counters
        _nturns = _nturns + 1
        _ntotalTurns += 1
    
    if _ntotalTurns % 8 != 0:
      for n in range((_ntotalTurns % 8) + 1, 9):
        _ntotalTurns += 1
        _p10s = np.append(_p10s, np.nan)
        _aps = np.append(_aps, np.nan)
        _ndcg5s = np.append(_ndcg5s, np.nan)
        _convNames = np.append(_convNames, "NO RESULT")

    # counters
    _ntopics += 1

# metrics
_p10s = np.reshape(_p10s, (_ntopics, 8))
_aps = np.reshape(_aps, (_ntopics, 8))
_ndcg5s = np.reshape(_ndcg5s, (_ntopics, 8))
# convs and turns names
_convNames = np.reshape(_convNames, (_ntopics, 9))

# generate plots
plots.plotMetricAlongConversation("Average Precision", [_aps], ["LMD"], _convNumbers)
plots.plotMetricAlongConversation("normalized Discounted Cumulative Gain", [_ndcg5s], ["LMD"], _convNumbers)
plots.plotMetricEachConversation("Average Precision", [_aps], ["LMD"], _convNumbers, _convNames)
plots.plotMetricEachConversation("normalized Discounted Cumulative Gain", [_ndcg5s], ["LMD"], _convNumbers, _convNames)

# total mean
_p10 = _p10/_nturns
_recall = _recall/_nturns
_ndcg5 = _ndcg5/_nturns



print()
print('P@10=', _p10, '  Recall=', _recall, '  NDCG@5=', _ndcg5)"""


import TRECCASTeval as trec
import numpy as np

import ElasticSearchSimpleAPI as es

import pprint as pprint

import project as project

elastic = es.ESSimpleAPI() if UPDATE_ELASTICSEARCH_RESULTS else None
test_bed = trec.ConvSearchEvaluation()

if RUN_PHASE == 1:
  project.project(REL_DOCS_PER_TURN, UPDATE_ELASTICSEARCH_RESULTS, elastic, test_bed, test_bed.train_topics, test_bed.relevance_judgments, IDX["TRAIN"], SET_NAME["TRAIN"], plots=True)

## Retrieval with the test conversations

In [6]:
"""import TRECCASTeval as trec
import numpy as np

from bert_serving.client import BertClient
from sklearn.metrics.pairwise import cosine_similarity

import ElasticSearchSimpleAPI as es
import numpy as np

import pprint as pprint

elastic = es.ESSimpleAPI()
test_bed = trec.ConvSearchEvaluation()

_recall = 0
_p10 = 0
_ndcg5 = 0
_nturns = 0
for topic in test_bed.test_topics:
    
    conv_id = topic['number']
    if conv_id not in (31, 32, 33, 34, 37, 40, 49, 50, 54, 56, 58, 59, 61, 67, 68, 69, 75, 77, 78, 79):
        continue

    for turn in topic['turn']:
        turn_id = turn['number']
        utterance = turn['raw_utterance']
        topic_turn_id = '%d_%d'% (conv_id, turn_id)
        
        aux = test_bed.test_relevance_judgments.loc[test_bed.test_relevance_judgments['topic_turn_id'] == (topic_turn_id)]
        num_rel = aux.loc[aux['rel'] != 0]['docid'].count()
        
        if num_rel == 0:
            continue

        result = elastic.search_body(query=utterance, numDocs = 100)

        if np.size(result) == 0 or num_rel == 0:
            print(topic_turn_id, utterance, num_rel, "NO RESULTS")
            continue
        else:
            print(topic_turn_id, utterance, num_rel)

        [p10, recall, ap, ndcg5] = test_bed.eval(result[['_id','_score']], topic_turn_id)

#        print('P10=', p10, '  Recall=', recall, '  NDCG=',ndcg)
        _recall = _recall + recall
        _p10 = _p10 + p10
        _ndcg5 = _ndcg5 + ndcg5
        
        _nturns = _nturns + 1

_p10 = _p10/_nturns
_recall = _recall/_nturns
_ndcg5 = _ndcg5/_nturns

print()
print('P10=', _p10, '  Recall=', _recall, '  NDCG@5', _ndcg5)
"""

import TRECCASTeval as trec
import numpy as np

import ElasticSearchSimpleAPI as es

import pprint as pprint

import project as project

elastic = es.ESSimpleAPI() if UPDATE_ELASTICSEARCH_RESULTS else None
test_bed = trec.ConvSearchEvaluation()

if RUN_PHASE == 1:
  project.project(REL_DOCS_PER_TURN, UPDATE_ELASTICSEARCH_RESULTS, elastic, test_bed, test_bed.test_topics, test_bed.test_relevance_judgments, IDX["TEST"], SET_NAME["TEST"], plots=True)


# zip the plots folder and download it
#!zip -r plots.zip /content/plots

#from google.colab import files
#files.download("plots.zip")

## Passage Re-ranking
The Passage Ranking notebook example illustrates how to use the BERT service to compute the similarity between sentences. Using the BERT service, implement a passage ranking method that re-ranks the results of the initial retrieval step.

To evaluate the results you should use the provided `ConvSearchEvaluation` class.
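
As a rough illustration of similarity-based re-ranking, the sketch below reorders a candidate list by the cosine similarity between a query vector and each passage vector. The function name `rerank_by_cosine` and the toy two-dimensional vectors are assumptions standing in for real BERT embeddings. This is not the implementation behind `project.project2Train` and `project.project2Test`, which is not shown in this notebook.

```python
import numpy as np

def rerank_by_cosine(query_vec, passage_vecs, passage_ids):
    """Return (passage_id, similarity) pairs sorted by decreasing cosine similarity."""
    q = np.asarray(query_vec, dtype=float)
    P = np.asarray(passage_vecs, dtype=float)
    # Cosine similarity of every passage row with the query; epsilon avoids division by zero
    sims = P @ q / (np.linalg.norm(P, axis=1) * np.linalg.norm(q) + 1e-12)
    # Stable sort keeps the first-stage order for passages with equal similarity
    order = np.argsort(-sims, kind="stable")
    return [(passage_ids[i], float(sims[i])) for i in order]

# Toy vectors (hypothetical); in practice these would come from the BERT encoder
query_vec = [1.0, 0.0]
passage_vecs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
passage_ids = ["p1", "p2", "p3"]

reranked = rerank_by_cosine(query_vec, passage_vecs, passage_ids)
for pid, score in reranked:
    print(pid, round(score, 4))
```

The re-ranked identifiers and scores can then be evaluated in the same way as the first-stage results, which makes the two rankings directly comparable.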


In [7]:
!pip install transformers
from transformers import BertTokenizerFast, BertModel
from sklearn.linear_model import LogisticRegression
import torch
import plots as plots

if RUN_PHASE == 2:
  bert_model_name = 'nboost/pt-bert-base-uncased-msmarco'
  tokenizer = BertTokenizerFast.from_pretrained(bert_model_name)
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # fall back to CPU when no GPU is attached
  model = BertModel.from_pretrained(bert_model_name, return_dict=True)
  model = model.to(device)

  aps1, ndcg5s1, recalls1, precisions1, convNumbers, convNames = project.project(REL_DOCS_PER_TURN, UPDATE_ELASTICSEARCH_RESULTS, elastic, test_bed, test_bed.test_topics, test_bed.test_relevance_judgments, IDX["TEST"], SET_NAME["TEST"])

  classifier = project.project2Train(REL_DOCS_PER_TURN, UPDATE_BERT_RESULTS, tokenizer, model, device, test_bed.train_topics, test_bed.relevance_judgments, IDX["TRAIN"], SET_NAME["TRAIN"])
  aps2, ndcg5s2, recalls2, precisions2 = project.project2Test(REL_DOCS_PER_TURN, UPDATE_BERT_RESULTS, classifier, tokenizer, model, device, test_bed.test_topics, test_bed, test_bed.test_relevance_judgments, IDX["TEST"], SET_NAME["TEST"])

  methods = ["LMD", "Re-Ranking"]
  
  plots.plotMetricAlongConversation(PROJ_DIR, REL_DOCS_PER_TURN, SET_NAME["TEST"], "Average Precision", [aps1, aps2], methods, convNumbers)
  plots.plotMetricAlongConversation(PROJ_DIR, REL_DOCS_PER_TURN, SET_NAME["TEST"], "normalized Discounted Cumulative Gain", [ndcg5s1, ndcg5s2], methods, convNumbers)
  plots.plotMetricEachConversation(PROJ_DIR, REL_DOCS_PER_TURN, SET_NAME["TEST"], "Average Precision", [aps1, aps2], methods, convNumbers, convNames)
  plots.plotMetricEachConversation(PROJ_DIR, REL_DOCS_PER_TURN, SET_NAME["TEST"], "normalized Discounted Cumulative Gain", [ndcg5s1, ndcg5s2], methods, convNumbers, convNames)
  plots.plotPrecisionRecall(PROJ_DIR, REL_DOCS_PER_TURN, SET_NAME["TEST"], [recalls1, recalls2], [precisions1, precisions2], methods, convNumbers)



## Conversation Context Modeling

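The Conversation State Tracking example illustrates how to use the T5 query rewriter, which rewrites each utterance using the previous turns of the conversation as context.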

To evaluate the results you should use the provided `ConvSearchEvaluation` class.
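
The sketch below shows, without loading any model, how the conversation history can be packed into a single input string per turn. It reuses the `[CTX]` and `[TURN]` separators that appear in the `QueryRewriterT5` class in this section; the helper names `build_rewriter_input` and `dialog_inputs` are hypothetical and exist only for illustration.

```python
def build_rewriter_input(current_query, previous_turns):
    """Join the current query and the earlier turns using the [CTX] / [TURN] separators."""
    return "{} [CTX] ".format(current_query) + " [TURN] ".join(previous_turns)

def dialog_inputs(utterances):
    """Build one rewriter input per turn; each turn only sees the turns before it."""
    return [build_rewriter_input(u, utterances[:i]) for i, u in enumerate(utterances)]

# First three turns of training conversation 1
utterances = [
    "What is a physician's assistant?",
    "What are the educational requirements required to become one?",
    "What does it cost?",
]

inputs = dialog_inputs(utterances)
for text in inputs:
    print(text)
```

The first turn has an empty context, so its input ends right after `[CTX]`. Later turns carry every earlier utterance, which is what lets a rewriter resolve references such as "one" or "it".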


In [8]:
if RUN_PHASE == 3:
  !pip install transformers
  !pip install spacy
  !python -m spacy download en_core_web_sm
  %tensorflow_version 2.x
  !pip install t5==0.5.0
  import tensorflow as tf
  import tensorflow_text
  import pprint
  import spacy

  !rm -rf /content/t5-canard
  !cp -r /content/drive/My\ Drive/faculdade/fct-miei/04_ano4_\(year4\)/semestre1/ri/infos_projeto/t5-canard.zip /content/t5-canard.zip
  !unzip /content/t5-canard.zip

  class QueryRewriterT5:
    def __init__(self, model_path="/content/t5-canard"):
      """
        Loads the T5 SavedModel found at model_path and stores in self.t5_model
        a callable that maps a list of input strings to the model outputs
      """
      if tf.executing_eagerly():
          print("Loading SavedModel in eager mode.")
          imported = tf.saved_model.load(model_path, ["serve"])
          self.t5_model = lambda x: imported.signatures['serving_default'](tf.constant(x))['outputs'].numpy()
      else:
          print("Loading SavedModel in tf 1.x graph mode.")
          tf.compat.v1.reset_default_graph()
          sess = tf.compat.v1.Session()
          meta_graph_def = tf.compat.v1.saved_model.load(sess, ["serve"], model_path)
          signature_def = meta_graph_def.signature_def["serving_default"]
          self.t5_model = lambda x: sess.run(
              fetches=signature_def.outputs["outputs"].name,
              feed_dict={signature_def.inputs["input"].name: x}
          )

    def rewrite_query_with_T5(self, _curr_query, _ctx_list):
      """
        _curr_query: str - the query string to be rewritten using T5
        _ctx_list: list - A list of strings containing the turns or text to give context to T5
        Returns a string with the rewritten query
      """
      _t5_query = '{} [CTX] '.format(_curr_query) + ' [TURN] '.join(_ctx_list)
      print("Query and context: {}".format(_t5_query))
      return self.t5_model([_t5_query])[0].decode('utf-8')

    def rewrite_dialog_with_T5(self, _queries_list):
      """
        _queries_list: list - A list of strings containing the raw utterances ordered from first to last
        Returns a list of strings with the rewritten queries
      """
      _rewritten_queries_list = []
      for i, _current_query in enumerate(_queries_list):
        # Only the turns before the current one are passed as context
        _rewritten_query = self.rewrite_query_with_T5(_current_query, _queries_list[:i])
        print("Rewritten query: {}\n".format(_rewritten_query))
        _rewritten_queries_list.append(_rewritten_query)
      return _rewritten_queries_list



  elastic = es.ESSimpleAPI()

  bert_model_name = 'nboost/pt-bert-base-uncased-msmarco'
  tokenizer = BertTokenizerFast.from_pretrained(bert_model_name)
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # fall back to CPU when no GPU is attached
  model = BertModel.from_pretrained(bert_model_name, return_dict=True)
  model = model.to(device)

  nlp = spacy.load('en_core_web_sm')
  rewriter = QueryRewriterT5('/content/t5-canard')

  _, _, _, _, convNumbers, convNames = project.project(REL_DOCS_PER_TURN, UPDATE_ELASTICSEARCH_RESULTS, elastic, test_bed, test_bed.test_topics, test_bed.test_relevance_judgments, IDX["TEST"], SET_NAME["TEST"])
  classifier = project.project2Train(REL_DOCS_PER_TURN, UPDATE_BERT_RESULTS, tokenizer, model, device, test_bed.train_topics, test_bed.relevance_judgments, IDX["TRAIN"], SET_NAME["TRAIN"])
  project.project3(REL_DOCS_PER_TURN, UPDATE_ELASTICSEARCH_RESULTS, rewriter, tokenizer, model, device, nlp, classifier, elastic, test_bed, test_bed.test_topics, test_bed.test_relevance_judgments, IDX["TEST"], SET_NAME["TEST"], convNumbers, convNames)
