Python: mapping the content of a structured text file to a dictionary tree
I'm looking for a method to map the content of a structured text file to a nested dictionary (dictionary tree). The text file consists of (nested) sections with each section starting with the pattern Begin $KEYWORD and ending with the pattern End $KEYWORD. An example could look like this:
Begin Section1
test1
End Section1
Begin Section2
test3
Begin Section3
test1
test2
End Section3
End Section2
I want to access the text lines corresponding to a specific section by reading the value of key "text" from a (nested) dictionary. E.g., in the example above, print(sect["Section2"]["Section3"]["text"]) should produce the output ['test1', 'test2'], where sect denotes the nested dictionary. My naive coding attempt produced this:
testtxt = """
Begin Section1
test1
End Section1
Begin Section2
test3
Begin Section3
test1
test2
End Section3
End Section2
"""
testtxt = list(filter(None, testtxt.split('\n')))

# root node
sect = dict()
sect["text"] = []

# save all nodes from current node all the way up
# to the root node in a list
stack = []

for line in testtxt:
    if line.startswith("Begin "):
        # section begins with line "Begin "
        key_word = line.split("Begin ")[1].rstrip()
        sect[key_word] = dict()
        sect[key_word]["text"] = []
        # save parent node to stack in order to be able to back up to parent node
        stack.append(sect)
        # walk down to child node
        sect = sect[key_word]
    elif line.startswith("End "):
        # section ends with line "End "
        # back up to parent node
        sect = stack[-1]
        stack.pop(-1)
    else:
        # assumption: string "text" is not used as keyword
        sect["text"].append(line)
This does what I want, but it looks rather unpythonic. The step from parent to child node is simply sect = sect[key_word]; however, for the return path from the child up to the parent node I had to resort to the list stack, which contains all nodes from the root down to the current node's parent. When an End KEYWORD line is found, the current node is set to the parent node taken from the end of stack, and that entry is removed from the list. I'd be grateful for suggestions on how to access the parent from the child node in a more elegant way (without using function recursion).
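For illustration, here is a minimal sketch of one non-recursive variant of the same stack idea. It is not a definitive solution: the function name parse_sections and the (keyword, node) tuples on the stack are hypothetical, and the strict check that every End line matches the innermost open Begin is an added assumption. The stack holds the whole path from the root to the current node, so the current node is always stack[-1] and no separate parent bookkeeping is needed. Unlike the attempt above, this sketch also strips surrounding whitespace from every line.

```python
def parse_sections(lines):
    """Build a nested dict from 'Begin KEY' / 'End KEY' delimited lines."""
    root = {"text": []}
    # Path from the root to the current node as (keyword, node) pairs;
    # the current node is always stack[-1][1].
    stack = [("", root)]
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("Begin "):
            key = line[len("Begin "):].strip()
            child = {"text": []}
            stack[-1][1][key] = child   # attach child to the current node
            stack.append((key, child))  # walk down to the child
        elif line.startswith("End "):
            key = line[len("End "):].strip()
            # Assumption: every End must match the innermost open Begin.
            if len(stack) == 1 or stack[-1][0] != key:
                raise ValueError(f"unexpected 'End {key}'")
            stack.pop()                 # back up to the parent
        else:
            # Assumption (as above): "text" is never used as a keyword.
            stack[-1][1]["text"].append(line)
    if len(stack) != 1:
        raise ValueError(f"missing 'End {stack[-1][0]}'")
    return root


sample = """
Begin Section1
test1
End Section1
Begin Section2
test3
Begin Section3
test1
test2
End Section3
End Section2
"""

sect = parse_sections(sample.splitlines())
print(sect["Section2"]["Section3"]["text"])  # ['test1', 'test2']
```

Keeping the keyword next to each node on the stack is what makes the mismatch check possible; if that validation is not wanted, the stack could hold the bare dictionaries instead.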