performance - Python dict from mobypos.txt file
I have a file from the Moby Project that pairs each word with one or more letters indicating its part of speech. For example:
hemoglobin\n
hemogram\n
hemoid\a
hemolysin\n
hemolysis\n
hemolytic\a
hemophile\na
hemophiliac\n
Here hemoglobin is a noun, hemoid is an adjective, and hemophile can be used as a noun or an adjective.
I have created a dict from the file that pairs each word with the letters indicating its part of speech, using the following code:
mm = open("mobypos.txt").readlines()
pairs = []
for x in mm:
    # split each line on the backslash into [word, part-of-speech letters]
    pairs.append(x.split("\\"))
posdict = dict(pairs)
This works successfully. I want to generate lists called nouns, verbs, adjectives, etc. that contain the words of each part of speech. What is the fastest way to do this, given that len(posdict.keys()) returns 233340?
You can use a list comprehension:
# On Python 2, posdict.iteritems() avoids building an intermediate list.
nouns = [word for word, pos in posdict.items() if 'n' in pos]
adjs = [word for word, pos in posdict.items() if 'a' in pos]
verbs = [word for word, pos in posdict.items() if 'v' in pos]
Using the in operator in the if clause places a word with multiple types in every list that matches one of its letters.
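If all of the part-of-speech lists are needed, a single pass over the dict may be cheaper than one comprehension per list. The following is only a sketch under some assumptions: the codes are single lowercase letters as in the excerpt from the question, and the names sample_lines and words_by_code are made up for illustration. It uses a few sample lines in place of the real mobypos.txt.

```python
from collections import defaultdict

# Sample lines in the same word\letters shape as the excerpt in the question
# (assumption: one pair per line); a real run would read mobypos.txt instead.
sample_lines = [
    "hemoglobin\\n\n",
    "hemoid\\a\n",
    "hemophile\\na\n",
]

# Build the word -> letters dict, dropping the trailing newline from each line.
posdict = dict(line.rstrip("\n").split("\\", 1) for line in sample_lines)

# One pass over the dict: file each word under every letter it carries.
words_by_code = defaultdict(list)
for word, codes in posdict.items():
    for code in codes:
        words_by_code[code].append(word)

nouns = words_by_code["n"]
adjectives = words_by_code["a"]
```

A word such as hemophile, tagged with both letters, ends up in both nouns and adjectives, which matches the behaviour of the in test in the comprehensions.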