“From Wikipedia”
Ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree. Ternary search trees are effective for many applications involving mapping strings to other values as well as completing near-neighbor lookups and other string-related queries. In addition, ternary search trees are a more space efficient (at the cost of speed) equivalent of typical tries. Further, ternary search trees can be used any time a hashtable would be used to store strings.
The figure below shows a ternary search tree with the strings “as”, “at”, “cup”, “cute”, “he”, “i” and “us”:
|
1 2 3 4 5 6 7 |
c
/ | \
a u h
| | | \
t t e u
/ / | / |
s p e i s |
Thanks to Wikipedia.
Python is an easy language to implement ideas and algorithms. Here is a simple implementation of Ternary search tree.
Lets roll..!!
First thing we need is a class to represent one character or one node. Let’s call it Node and define some member function and variables for it.
|
1 2 3 4 5 6 7 |
class Node:
def __init__(self,ch,flag): # Constructor for Node Object
self.ch = ch
self.flag = flag
self.left = 0
self.right = 0
self.center = 0 |
(You better read the theory part of Ternary Search Tree in wikipedia before reading the rest)
__init__ : This is how a constructor looks like in python. In c++ its class name itself (before 4 years).
So here defined 5 member variables : ch For storing the character the node is representing. For TST (ternary search tree) we need three pointer (You can think of C pointers concept). One for storing address/handle of node whose ch value is lesser than the what it stores and right for nodes storing address/handle of node whose ch value is greater than what it holds. And the center member hold the address of the node which store the character in the next level. Its little tricky lets say it like this left and right points to nodes where are is same level to the current node. We will come back to it after some time. And flag member variabe is for saying that this is the last character of a valid word, its a flag says that from here to top its a valid word.
So we created the skeleton of a node object, now we want to add methods to use TST.
Method to Add a node to the TST.
Three arguments to method they are : self (the object itself method is calling ), string (Word (or rest of word) add to TST), node (Currently where we are)
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
def Add(self,string,node): # Function to add a string
key = string[0] #Extract First character of the word (or rest of the word as its a recursive method)
if node == 0 : #
node = Node(key,0) #if node is a 0 means Create a new node with key and append it to TST
#if current node is not 0 means there is some character in node.ch, so according to
# the value in node.ch we have to move forward.
if key < node.ch :
node.left = node.Add(string,node.left) # if the key (or character to add) is less than and not
# equal to node.ch means we have to add the string to left of the the current node, you also note
#that we are passing string itself to the next recursive add operation because the key is not added
# in TST.
elif key > node.ch :
node.right = node.Add(string,node.right)
#same as node.left adding only difference is checking whether key is greater than or equal to node.ch
else :
if len(string) == 1 :
node.flag = 1
else : node.center = node.Add(string[1:],node.center)
# Adding the character to TST, if the if that was the last character of the string
# we can add set the flag to say that a valid word ends here.
# if its not then we have to create rest of tree and point the current node.center to the
# the next ultimate node, thats why we are returning the currently passing node
return node |
Method to search a string in TST- check whether a particular string is there or not
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
def simple(self,string): # Function to search a string in the Ternary Search Tree
temp = self #Assuming self is the head of the TST
i=0
while temp != 0 :
#Finding the correct path to search by comparing the ch and key
if (string[i] < temp.ch) : temp = temp.left;
elif(string[i] > temp.ch) : temp = temp.right;
#If found move in the center path, which is the next level (string index++)
else :
i=i+1
if(i == len(string)):
return temp.flag #If string ends here return the flag Value
# which determines whether its a valid word in TST
temp = temp.center #If not Move on the path
return 0 |
The next methods are for Auto completion tasks, like if we give string app as input, then function should return app, application, apple, applet etc(Provided all this words are in TST).
Method do a DFS in TST : spdfs
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
def spdfs(self,match): # DFS for Ternary Search Tree
if self.flag == 1 : # If its a flag then print it
print("Match : ",match)
if self.center == 0 and self.left == 0 and self.right == 0:
return #IF TST ends here return
if self.center != 0 :
self.center.spdfs(match + self.center.ch) #If center is not zero then search rest with
# new match = match + current character (self.center.ch) and call with self.center object
# We are printing the match if there is a flag = 1 in current node or self
# if self.right and self.left != 0 then we can search in that path also but remove last character
# from match because we can only select one character from one level (left - right - current are )
# in one level
if self.right != 0 :
self.right.spdfs(match[:-1]+self.right.ch)
if self.left != 0 :
self.left.spdfs(match[:-1]+self.left.ch) |
Driver search method this method is using spdfs to print auto completion candidates
Method search
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
def search(self,string,match):
# Function to implement Auto complete search
if len(string) > 0: # If there is more character to come go on
key = string[0] # key is first character - using to find the direction
if key < self.ch : #
if(self.left == 0):
print("No Match Found") #Checking key < self.ch and found that
return #self.left = 0 means our search ends here
# because its the only path we have
self.left.search(string,match) # if self.left !=0 continue our search
elif key > self.ch : #same as self.left case
if(self.right == 0):
print("Not Match Found")
return
self.right.search(string,match)
else : # If character is matching or equal
if len(string) == 1:
# if last sting and flag = 1 print match + self.ch
if self.flag == 1 : print("Match ",match+self.ch)
# if last character and self.center !=0 Means more words to come
# with this intial characters apply a dfs from there
if self.center != 0 :
self.center.spdfs(match+self.ch+self.center.ch)
return 1
# if characters are plenty there in string continue search(recursive call)
self.center.search(string[1:],match+key)
else : # if there is not more characters, means TST didn't have any
# string starting with given input
print("Invalid String")
return |
Lets add some code for Data populating and to use TST.
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
def fileparse(filename,node):
#Parse the Input Dict file and build the TST
fd = open(filename)
line = fd.readline().strip('\r\n')
while line !='':
node.Add(line,node)
line = fd.readline().strip('\r\n')
if __name__=='__main__':
root = Node('',0)
# Give the Path to the Dictionary File in
Path_to_dict = "test.txt"
fileparse(Path_to_dict,root)
inp = ''
while inp !='q':
inp = raw_input("Enter String : ",)
root.search(inp,'') |
Ternary search tree can be consider as Less memory equivalent Trie and not speedy as Trie.
Whole Code here without comments (Oh!! Thanks god
):
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 |
class Node:
def __init__(self,ch,flag): # Constructor for Node Object
self.ch = ch
self.flag = flag
self.left = 0
self.right = 0
self.center = 0
def Add(self,string,node): # Function to add a string
key = string[0]
if node == 0 :
node = Node(key,0)
if key < node.ch :
node.left = node.Add(string,node.left)
elif key > node.ch :
node.right = node.Add(string,node.right)
else :
if len(string) == 1 :
node.flag = 1
else : node.center = node.Add(string[1:],node.center)
return node
def spdfs(self,match): # DFS for Ternary Search Tree
if self.flag == 1 :
print("Match : ",match)
if self.center == 0 and self.left == 0 and self.right == 0:
return
if self.center != 0 :
self.center.spdfs(match + self.center.ch)
if self.right != 0 :
self.right.spdfs(match[:-1]+self.right.ch)
if self.left != 0 :
self.left.spdfs(match[:-1]+self.left.ch)
def simple(self,string): # Function to search a string in the Ternary Search Tree
temp = self
i=0
while temp != 0 :
if (string[i] < temp.ch) : temp = temp.left;
elif(string[i] > temp.ch) : temp = temp.right;
else :
i=i+1
if(i == len(string)):
return temp.flag
temp = temp.center
return 0
def search(self,string,match):
# Function to implement Auto complete search
if len(string) > 0:
key = string[0]
if key < self.ch :
if(self.left == 0):
print("No Match Found")
return
self.left.search(string,match)
elif key > self.ch :
if(self.right == 0):
print("Not Match Found")
return
self.right.search(string,match)
else :
if len(string) == 1:
if self.flag == 1 : print("Match ",match+self.ch)
if self.center != 0 :
self.center.spdfs(match+self.ch+self.center.ch)
return 1
self.center.search(string[1:],match+key)
else :
print("Invalid String")
return
def fileparse(filename,node):
#Parse the Input Dict file and build the TST
fd = open(filename)
line = fd.readline().strip('\r\n')
while line !='':
node.Add(line,node)
line = fd.readline().strip('\r\n')
if __name__=='__main__':
root = Node('',0)
# Give the Path to the Dictionary File in
Path_to_dict = "test.txt"
fileparse(Path_to_dict,root)
inp = ''
while inp !='q':
inp = raw_input("Enter String : ",)
root.search(inp,'') |