The Shed is going Social! Join us on FaceBook and Twitter and chime in on the conversation.
|
 |
|
Dev Shed Forums
> Programming Languages
> Python Programming
|
Write Corpus To Text File?
Discuss Write Corpus To Text File? in the Python Programming forum on Dev Shed. Write Corpus To Text File? Python Programming forum discussing coding techniques, tips and tricks, and Zope related information. Python was designed from the ground up to be a completely object-oriented programming language.
|
|
 |
|
|
|
|

Dev Shed Forums Sponsor:
|
|
|

November 24th, 2012, 06:59 PM
|
|
Registered User
|
|
Join Date: Nov 2012
Posts: 1
Time spent in forums: 12 m 55 sec
Reputation Power: 0
|
|
Write Corpus To Text File?
Hello: Just wanted to see if anyone had tips on writing the "editorial section" of the brown corpus from nltk to a plain text file:
I have: but that's not working out
Code:
Original
- Code |
|
|
|
import nltk
from nltk.corpus import brown
textfile = file("Output.txt", "w")
for i in range(len(brown)):
textfile.write(brown[i])
textfile.write(" ")
|

November 24th, 2012, 08:12 PM
|
 |
Contributing User
|
|
|
|
Have you read the interactive nltk book?
Did you download the Brown corpus?
>>> nltk.download()
That's quite a lot to learn before one figures out why your program doesn't work when you won't tell what error you received.
length error because lazy loader didn't yet load the text?
Other?
__________________
[code] Code tags[/code] are essential for python code!
|

November 24th, 2012, 09:32 PM
|
|
Registered User
|
|
Join Date: Nov 2012
Posts: 2
Time spent in forums: 24 m 26 sec
Reputation Power: 0
|
|
|
OP here, doing some email shuffling and this will be my new permanent account. Anyways, I can't quote you because it contains a "URL" that I can't use because this account is new but the error that I get with that snippet is:
"TypeError: object of type 'LazyCorpusLoader' has no len()"
Confused as to why that would be?
I have nltk downloaded with the corpus I need
|

November 25th, 2012, 10:55 AM
|
 |
Contributing User
|
|
|
|
Try the open method of Brown.
Code:
>>> import nltk
>>> from nltk.corpus import brown
>>> help(brown.open)
Help on method open in module nltk.corpus.reader.api:
open(self, file, sourced=False) method of nltk.corpus.reader.tagged.CategorizedTaggedCorpusReader instance
Return an open stream that can be used to read the given file.
If the file's encoding is not C{None}, then the stream will
automatically decode the file's contents into unicode.
@param file: The file identifier of the file to read.
>>>
|

November 25th, 2012, 02:16 PM
|
|
Registered User
|
|
Join Date: Nov 2012
Posts: 2
Time spent in forums: 24 m 26 sec
Reputation Power: 0
|
|
I used brown.open and tried the same code. However this time I was met with the "'CategorizedTaggedCorpusReader' has no len()" error.
Should I write the code a different way completely?
|

November 25th, 2012, 04:12 PM
|
 |
Contributing User
|
|
|
|
You should read the friggin' manual.
Code:
>>> from nltk.corpus import brown
>>> brown.fileids()[:4]
['ca01', 'ca02', 'ca03', 'ca04']
>>> inf = brown.open('ca01')
>>> import pprint
>>> pprint.pprint(inf.readlines(4))
['\n',
'\n',
"\tThe/at Fulton/np-tl County/nn-tl Grand/jj-tl Jury/nn-tl said/vbd Friday/nr an/at investigation/nn of/in Atlanta's/np$ recent/jj primary/nn election/nn produced/vbd ``/`` no/at evidence/nn ''/'' that/cs any/dti irregularities/nns took/vbd place/nn ./.\n",
'\n',
'\n',
"\tThe/at jury/nn further/rbr said/vbd in/in term-end/nn presentments/nns that/cs the/at City/nn-tl Executive/jj-tl Committee/nn-tl ,/, which/wdt had/hvd over-all/jj charge/nn of/in the/at election/nn ,/, ``/`` deserves/vbz the/at praise/nn and/cc thanks/nns of/in the/at City/nn-tl of/in-tl Atlanta/np-tl ''/'' for/in the/at manner/nn in/in which/wdt the/at election/nn was/bedz conducted/vbn ./.\n",
'\n',
'\n',
"\tThe/at September-October/np term/nn jury/nn had/hvd been/ben charged/vbn by/in Fulton/np-tl Superior/jj-tl Court/nn-tl Judge/nn-tl Durwood/np Pye/np to/to investigate/vb reports/nns of/in possible/jj ``/`` irregularities/nns ''/'' in/in the/at hard-fought/jj primary/nn which/wdt was/bedz won/vbn by/in Mayor-nominate/nn-tl Ivan/np Allen/np Jr./np ./.\n",
'\n',
'\n',
"\t``/`` Only/rb a/at relative/jj handful/nn of/in such/jj reports/nns was/bedz received/vbn ''/'' ,/, the/at jury/nn said/vbd ,/, ``/`` considering/in the/at widespread/jj interest/nn in/in the/at election/nn ,/, the/at number/nn of/in voters/nns and/cc the/at size/nn of/in this/dt city/nn ''/'' ./.\n",
'\n',
'\n',
"\tThe/at jury/nn said/vbd it/pps did/dod find/vb that/cs many/ap of/in Georgia's/np$ registration/nn and/cc election/nn laws/nns ``/`` are/ber outmoded/jj or/cc inadequate/jj and/cc often/rb ambiguous/jj ''/'' ./.\n",
'\n',
'\n',
"\tIt/pps recommended/vbd that/cs Fulton/np legislators/nns act/vb ``/`` to/to have/hv these/dts laws/nns studied/vbn and/cc revised/vbn to/in the/at end/nn of/in modernizing/vbg and/cc improving/vbg them/ppo ''/'' ./.\n",
'\n',
'\n',
"\tThe/at grand/jj jury/nn commented/vbd on/in a/at number/nn of/in other/ap topics/nns ,/, among/in them/ppo the/at Atlanta/np and/cc Fulton/np-tl County/nn-tl purchasing/vbg departments/nns which/wdt it/pps said/vbd ``/`` are/ber well/ql operated/vbn and/cc follow/vb generally/rb accepted/vbn practices/nns which/wdt inure/vb to/in the/at best/jjt interest/nn of/in both/abx governments/nns ''/'' ./.\n",
'\n',
'\n',
'\n',
'Merger/nn-hl proposed/vbn-hl \n',
"However/wrb ,/, the/at jury/nn said/vbd it/pps believes/vbz ``/`` these/dts two/cd offices/nns should/md be/be combined/vbn to/to achieve/vb greater/jjr efficiency/nn and/cc reduce/vb the/at cost/nn of/in administration/nn ''/'' ./.\n",
'\n',
'\n',
"\tThe/at City/nn-tl Purchasing/vbg-tl Department/nn-tl ,/, the/at jury/nn said/vbd ,/, ``/`` is/bez lacking/vbg in/in experienced/vbn clerical/jj personnel/nns as/cs a/at result/nn of/in city/nn personnel/nns policies/nns ''/'' ./.\n",
"It/pps urged/vbd that/cs the/at city/nn ``/`` take/vb steps/nns to/to remedy/vb ''/'' this/dt problem/nn ./.\n",
'\n',
'\n',
"\tImplementation/nn of/in Georgia's/np$ automobile/nn title/nn law/nn was/bedz also/rb recommended/vbn by/in the/at outgoing/jj jury/nn ./.\n",
'\n',
'\n',
"\tIt/pps urged/vbd that/cs the/at next/ap Legislature/nn-tl ``/`` provide/vb enabling/vbg funds/nns and/cc re-set/vb the/at effective/jj date/nn so/cs that/cs an/at orderly/jj implementation/nn of/in the/at law/nn may/md be/be effected/vbn ''/'' ./.\n",
'\n',
'\n',
"\tThe/at grand/jj jury/nn took/vbd a/at swipe/nn at/in the/at State/nn-tl Welfare/nn-tl Department's/nn$-tl handling/nn of/in federal/jj funds/nns granted/vbn for/in child/nn welfare/nn services/nns in/in foster/jj homes/nns ./.\n",
'\n',
'\n',
"\t``/`` This/dt is/bez one/cd of/in the/at major/jj items/nns in/in the/at Fulton/np-tl County/nn-tl general/jj assistance/nn program/nn ''/'' ,/, the/at jury/nn said/vbd ,/, but/cc the/at State/nn-tl Welfare/nn-tl Department/nn-tl ``/`` has/hvz seen/vbn fit/jj to/to distribute/vb these/dts funds/nns through/in the/at welfare/nn departments/nns of/in all/abn the/at counties/nns in/in the/at state/nn with/in the/at exception/nn of/in Fulton/np-tl County/nn-tl ,/, which/wdt receives/vbz none/pn of/in this/dt money/nn ./.\n",
'\n',
'\n',
"\tThe/at jurors/nns said/vbd they/ppss realize/vb ``/`` a/at proportionate/jj distribution/nn of/in these/dts funds/nns might/md disable/vb this/dt program/nn in/in our/pp$ less/ql populous/jj counties/nns ''/'' ./.\n",
'\n',
'\n',
"\tNevertheless/rb ,/, ``/`` we/ppss feel/vb that/cs in/in the/at future/nn Fulton/np-tl County/nn-tl should/md receive/vb some/dti portion/nn of/in these/dts available/jj funds/nns ''/'' ,/, the/at jurors/nns said/vbd ./.\n",
"``/`` Failure/nn to/to do/do this/dt will/md continue/vb to/to place/vb a/at disproportionate/jj burden/nn ''/'' on/in Fulton/np taxpayers/nns ./.\n",
'\n',
'\n',
"\tThe/at jury/nn also/rb commented/vbd on/in the/at Fulton/np ordinary's/nn$ court/nn which/wdt has/hvz been/ben under/in fire/nn for/in its/pp$ practices/nns in/in the/at appointment/nn of/in appraisers/nns ,/, guardians/nns and/cc administrators/nns and/cc the/at awarding/nn of/in fees/nns and/cc compensation/nn ./.\n",
'\n',
'\n',
'\n',
'Wards/nns-hl protected/vbn-hl \n',
"The/at jury/nn said/vbd it/pps found/vbd the/at court/nn ``/`` has/hvz incorporated/vbn into/in its/pp$ operating/vbg procedures/nns the/at recommendations/nns ''/'' of/in two/cd previous/jj grand/jj juries/nns ,/, the/at Atlanta/np-tl Bar/nn-tl Association/nn-tl and/cc an/at interim/nn citizens/nns committee/nn ./.\n",
'\n',
'\n',
"\t``/`` These/dts actions/nns should/md serve/vb to/to protect/vb in/in fact/nn and/cc in/in effect/nn the/at court's/nn$ wards/nns from/in undue/jj costs/nns and/cc its/pp$ appointed/vbn and/cc elected/vbn servants/nns from/in unmeritorious/jj criticisms/nns ''/'' ,/, the/at jury/nn said/vbd ./.\n",
'\n',
'\n',
"\tRegarding/in Atlanta's/np$ new/jj multi-million-dollar/jj airport/nn ,/, the/at jury/nn recommended/vbd ``/`` that/cs when/wrb the/at new/jj management/nn takes/vbz charge/nn Jan./np 1/cd the/at airport/nn be/be operated/vbn in/in a/at manner/nn that/wps will/md eliminate/vb political/jj influences/nns ''/'' ./.\n",
'\n',
'\n',
"\tThe/at jury/nn did/dod not/* elaborate/vb ,/, but/cc it/pps added/vbd that/cs ``/`` there/ex should/md be/be periodic/jj surveillance/nn of/in the/at pricing/vbg practices/nns of/in the/at concessionaires/nns for/in the/at purpose/nn of/in keeping/vbg the/at prices/nns reasonable/jj ''/'' ./.\n",
'\n',
'\n',
'\n',
'Ask/vb-hl jail/nn-hl deputies/nns-hl \n',
'On/in other/ap matters/nns ,/, the/at jury/nn recommended/vbd that/cs :/: (/( 1/cd )/) \n',
"Four/cd additional/jj deputies/nns be/be employed/vbn at/in the/at Fulton/np-tl County/nn-tl Jail/nn-tl and/cc ``/`` a/at doctor/nn ,/, medical/jj intern/nn or/cc extern/nn be/be employed/vbn for/in night/nn and/cc weekend/nn duty/nn at/in the/at jail/nn ''/'' ./.\n",
'(/( 2/cd )/) \n',
"Fulton/np legislators/nns ``/`` work/vb with/in city/nn officials/nns to/to pass/vb enabling/vbg legislation/nn that/wps will/md permit/vb the/at establishment/nn of/in a/at fair/jj and/cc equitable/jj ''/'' pension/nn plan/nn for/in city/nn employes/nns ./.\n",
'\n',
'\n',
"\tThe/at jury/nn praised/vbd the/at administration/nn and/cc operation/nn of/in the/at Atlanta/np-tl Police/nns-tl Department/nn-tl ,/, the/at Fulton/np-tl Tax/nn-tl Commissioner's/nn$-tl Office/nn-tl ,/, the/at Bellwood/np and/cc Alpharetta/np prison/nn farms/nns ,/, Grady/np-tl Hospital/nn-tl and/cc the/at Fulton/np-tl Health/nn-tl Department/nn-tl ./.\n",
'\n',
'\n',
'\tMayor/nn-tl William/np B./np Hartsfield/np filed/vbd suit/nn for/in divorce/nn from/in his/pp$ wife/nn ,/, Pearl/np Williams/np Hartsfield/np ,/, in/in Fulton/np-tl Superior/jj-tl Court/nn-tl Friday/nr ./.\n',
'His/pp$ petition/nn charged/vbd mental/jj cruelty/nn ./.\n',
'\n',
'\n',
'\tThe/at couple/nn was/bedz married/vbn Aug./np 2/cd ,/, 1913/cd ./.\n',
'They/ppss have/hv a/at son/nn ,/, William/np Berry/np Jr./np ,/, and/cc a/at daughter/nn ,/, Mrs./np J./np M./np Cheshire/np of/in Griffin/np ./.\n',
'\n',
'\n',
'\tAttorneys/nns for/in the/at mayor/nn said/vbd that/cs an/at amicable/jj property/nn settlement/nn has/hvz been/ben agreed/vbn upon/rb ./.\n',
'\n',
'\n',
"\tThe/at petition/nn listed/vbd the/at mayor's/nn$ occupation/nn as/cs ``/`` attorney/nn ''/'' and/cc his/pp$ age/nn as/cs 71/cd ./.\n",
"It/pps listed/vbd his/pp$ wife's/nn$ age/nn as/cs 74/cd and/cc place/nn of/in birth/nn as/cs Opelika/np ,/, Ala./np ./.\n",
'\n',
'\n',
'\tThe/at petition/nn said/vbd that/cs the/at couple/nn has/hvz not/* lived/vbn together/rb as/cs man/nn and/cc wife/nn for/in more/ap than/in a/at year/nn ./.\n',
'\n',
'\n',
'\tThe/at Hartsfield/np home/nr is/bez at/in 637/cd E./np Pelham/np Rd./nn-tl Aj/nn ./.\n']
|

November 27th, 2012, 06:11 PM
|
 |
Contributing User
|
|
|
|
Here is some help:
Code:
'''nltk_brown_sentences.py
show word lists of brown sentences
installed
nltk-2.0.4.win32.exe
from
http://nltk.org/install.html
and
PyYAML-3.10.win32-py2.7.exe
from
http://pyyaml.org/wiki/PyYAML
then run
nltk.download()
to install data
tested with Python 2.7.3
'''
from nltk.corpus import brown
sentence1 = brown.sents()[0]
print(sentence1)
print('-'*70)
'''
['The', 'Fulton', 'County', 'Grand', 'Jury', 'said', 'Friday', 'an',
'investigation', 'of', "Atlanta's", 'recent', 'primary', 'election',
'produced', '``', 'no', 'evidence', "''", 'that', 'any', 'irregularities',
'took', 'place', '.']
'''
sentence2 = brown.sents()[1]
print(sentence2)
'''
['The', 'jury', 'further', 'said', 'in', 'term-end', 'presentments',
'that', 'the', 'City', 'Executive', 'Committee', ',', 'which', 'had',
'over-all', 'charge', 'of', 'the', 'election', ',', '``', 'deserves',
'the', 'praise', 'and', 'thanks', 'of', 'the', 'City', 'of', 'Atlanta',
"''", 'for', 'the', 'manner', 'in', 'which', 'the', 'election', 'was',
'conducted', '.']
'''
__________________
Real Programmers always confuse Christmas and Halloween because Oct31 == Dec25
|
Developer Shed Advertisers and Affiliates
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Rate This Thread |
Linear Mode
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|