performance - Python dict from mobypos.txt file

- March 15, 2012

I have a file from the Moby project that pairs words with one or more letters indicating part of speech. Example:

    hemoglobin\n
    hemogram\n
    hemoid\a
    hemolysin\n
    hemolysis\n
    hemolytic\a
    hemophile\na
    hemophiliac\n

Here hemoglobin is a noun, hemoid is an adjective, and hemophile can be used as a noun or an adjective.

I have created a dict from the file that pairs each word with the letters indicating its part of speech, using the following code:

    mm = open("mobypos.txt").readlines()
    pairs = []
    for x in mm:
        # each line has the form word\codes
        pairs.append(x.split("\\"))
    posdict = dict(pairs)

This works successfully. I want to generate lists called nouns, verbs, adjectives, etc. that contain the words of each part of speech. What is the fastest way to do this, given that len(posdict.keys()) returns 233340?

You can use a list comprehension:

    nouns = [word for word, type in posdict.iteritems() if 'n' in type]
    adjs = [word for word, type in posdict.iteritems() if 'a' in type]
    verbs = [word for word, type in posdict.iteritems() if 'v' in type]

Using the in operator in the if clause places a word with multiple types in every list that matches.
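If you would rather not walk the dict once per part of speech, a single pass that groups words by code is another option. Below is a minimal sketch, assuming Python 3 (so items() instead of iteritems()) and the lowercase codes shown in the example above. The names sample_lines and by_pos are made up for illustration, and the three inline lines stand in for the real mobypos.txt. I have not timed this against the comprehensions, so treat any speed difference as something to measure.

```python
from collections import defaultdict

# Hypothetical stand-in for the lines of mobypos.txt (format: word\codes).
sample_lines = [
    "hemoglobin\\n\n",
    "hemoid\\a\n",
    "hemophile\\na\n",
]

# Strip the trailing newline before splitting, so the codes stay clean.
posdict = dict(line.strip().split("\\", 1) for line in sample_lines)

# One pass over the dict: add each word to the list for every code it carries.
by_pos = defaultdict(list)
for word, codes in posdict.items():
    for code in codes:
        by_pos[code].append(word)

nouns = by_pos["n"]
adjs = by_pos["a"]
verbs = by_pos["v"]  # empty for this sample, since no line carries 'v'
```

A word such as hemophile, tagged na, ends up in both nouns and adjs, the same as with the in test.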
