PostgreSQL
 sql >> Base de données >  >> RDS >> PostgreSQL

Trouver des phrases avec deux mots adjacents dans Pg

Solution plus simple, mais ne donne des résultats que lorsqu'il n'y a pas d'espace dans item.position s :

SELECT DISTINCT sentence.sentenceid 
  FROM sentence 
  JOIN item ON sentence.sentenceid = item.sentenceid
  JOIN word ON item.wordid = word.wordid
  JOIN item AS next_item ON sentence.sentenceid = next_item.sentenceid
                        AND next_item.position = item.position + 1
  JOIN word AS next_word ON next_item.wordid = next_word.wordid
 WHERE word.spelling = 'word1'
   AND next_word.spelling = 'word2'

Solution plus générale, utilisant les fonctions de fenêtre :

SELECT DISTINCT sentenceid
FROM (SELECT sentence.sentenceid,
             word.spelling,
             lead(word.spelling) OVER (PARTITION BY sentence.sentenceid
                                           ORDER BY item.position)
        FROM sentence 
        JOIN item ON sentence.sentenceid = item.sentenceid
        JOIN word ON item.wordid = word.wordid) AS pairs
 WHERE spelling = 'word1'
   AND lead = 'word2'

Modifier :Également solution générale (écarts autorisés), mais avec jointures uniquement :

SELECT DISTINCT sentence.sentenceid
  FROM sentence 
  JOIN item ON sentence.sentenceid = item.sentenceid
  JOIN word ON item.wordid = word.wordid
  JOIN item AS next_item ON sentence.sentenceid = next_item.sentenceid
                        AND next_item.position > item.position
  JOIN word AS next_word ON next_item.wordid = next_word.wordid
  LEFT JOIN item AS mediate_word ON sentence.sentenceid = mediate_word.sentenceid
                                AND mediate_word.position > item.position
                                AND mediate_word.position < next_item.position
 WHERE mediate_word.wordid IS NULL
   AND word.spelling = 'word1'
   AND next_word.spelling = 'word2'