Problem when writing to text file.
-
Hello, I was working through a collective intelligence book, and the code is supposed to write to a text file called blogdata.txt after processing the feeds listed in feedlist.txt. However, blogdata.txt is never created. The code doesn't give me an error or anything: when I hit run, it runs for a while and then nothing happens. Here is the code.
import feedparser
import re

# Returns title and dictionary of word counts for an RSS feed
def getwordcounts(url):
    # Parse the feed
    d=feedparser.parse(url)
    wc={}

    # Loop over all the entries
    for e in d.entries:
        if 'summary' in e: summary=e.summary
        else: summary=e.description

        # Extract a list of words
        words=getwords(e.title+' '+summary)
        for word in words:
            wc.setdefault(word,0)
            wc[word]+=1
    return d.feed.title,wc

def getwords(html):
    # Remove all the HTML tags
    txt=re.compile(r'<[^>]+>').sub('',html)

    # Split words by all non-alpha characters
    words=re.compile(r'[^A-Z^a-z]+').split(txt)

    # Convert to lowercase
    return [word.lower() for word in words if word!='']

apcount={}
wordcounts={}
for feedurl in file('feedlist.txt'):
    title,wc=getwordcounts(feedurl)
    wordcounts[title]=wc
    for word,count in wc.items():
        apcount.setdefault(word,0)
        if count>1:
            apcount[word]+=1

wordlist=[]
for w,bc in apcount.items():
    frac=float(bc)/len(feedlist)
    if frac>0.1 and frac<0.5:
        wordlist.append(w)

out=file('blogdata.txt','w')
out.write('Blog')
for word in wordlist: out.write('\t%s' % word)
out.write('\n')
for blog,wc in wordcounts.items():
    out.write(blog)
    for word in wordlist:
        if word in wc: out.write('\t%d' % wc[word])
        else: out.write('\t0')
    out.write('\n')
Now here is feedlist.txt:
http://feeds.feedburner.com/37signals/beMH
http://feeds.feedburner.com/blogspot/bRuz
http://battellemedia.com/index.xml
http://blog.guykawasaki.com/index.rdf
http://blog.outer-court.com/rss.xml
http://feeds.searchenginewatch.com/sewblog
http://blog.topix.net/index.rdf
http://blogs.abcnews.com/theblotter/index.rdf
http://feeds.feedburner.com/ConsumingExperienceFull
http://flagrantdisregard.com/index.php/feed/
http://featured.gigaom.com/feed/
http://gizmodo.com/index.xml
http://gofugyourself.typepad.com/go_fug_yourself/index.rdf
http://googleblog.blogspot.com/rss.xml
http://feeds.feedburner.com/GoogleOperatingSystem
http://headrush.typepad.com/creating_passionate_users/index.rdf
http://feeds.feedburner.com/instapundit/main
http://jeremy.zawodny.com/blog/rss2.xml
http://joi.ito.com/index.rdf
http://feeds.feedburner.com/Mashable
http://michellemalkin.com/index.rdf
http://moblogsmoproblems.blogspot.com/rss.xml
http://newsbusters.org/node/feed
http://beta.blogger.com/feeds/27154654/posts/full?alt=rss
http://feeds.feedburner.com/paulstamatiou
http://powerlineblog.com/index.rdf
http://feeds.feedburner.com/Publishing20
http://radar.oreilly.com/index.rdf
http://scienceblogs.com/pharyngula/index.xml
http://scobleizer.wordpress.com/feed/
http://sethgodin.typepad.com/seths_blog/index.rdf
http://rss.slashdot.org/Slashdot/slashdot
http://thinkprogress.org/feed/
http://feeds.feedburner.com/andrewsullivan/rApM
http://wilwheaton.typepad.com/wwdnbackup/index.rdf
http://www.43folders.com/feed/
http://www.456bereastreet.com/feed.xml
http://www.autoblog.com/rss.xml
http://www.bloggersblog.com/rss.xml
http://www.bloglines.com/rss/about/news
http://www.blogmaverick.com/rss.xml
http://www.boingboing.net/index.rdf
http://www.buzzmachine.com/index.xml
http://www.captainsquartersblog.com/mt/index.rdf
http://www.coolhunting.com/index.rdf
http://feeds.copyblogger.com/Copyblogger
http://feeds.feedburner.com/crooksandliars/YaCP
http://feeds.dailykos.com/dailykos/index.xml
http://www.deadspin.com/index.xml
http://www.downloadsquad.com/rss.xml
http://www.engadget.com/rss.xml
http://www.gapingvoid.com/index.rdf
http://www.gawker.com/index.xml
http://www.gothamist.com/index.rdf
http://www.huffingtonpost.com/raw_feed_index.rdf
http://www.hyperorg.com/blogger/index.rdf
http://www.joelonsoftware.com/rss.xml
http://www.joystiq.com/rss.xml
http://www.kotaku.com/index.xml
http://feeds.kottke.org/main
http://www.lifehack.org/feed/
http://www.lifehacker.com/index.xml
http://littlegreenfootballs.com/weblog/lgf-rss.php
http://www.makezine.com/blog/index.xml
http://www.mattcutts.com/blog/feed/
http://xml.metafilter.com/rss.xml
http://www.mezzoblue.com/rss/index.xml
http://www.micropersuasion.com/index.rdf
http://www.neilgaiman.com/journal/feed/rss.xml
http://www.oilman.ca/feed/
http://www.perezhilton.com/index.xml
http://www.plasticbag.org/index.rdf
http://www.powazek.com/rss.xml
http://www.problogger.net/feed/
http://feeds.feedburner.com/QuickOnlineTips
http://www.readwriteweb.com/rss.xml
http://www.schneier.com/blog/index.rdf
http://scienceblogs.com/sample/combined.xml
http://www.seroundtable.com/index.rdf
http://www.shoemoney.com/feed/
http://www.sifry.com/alerts/index.rdf
http://www.simplebits.com/xml/rss.xml
http://feeds.feedburner.com/Spikedhumor
http://www.stevepavlina.com/blog/feed
http://www.talkingpointsmemo.com/index.xml
http://www.tbray.org/ongoing/ongoing.rss
http://feeds.feedburner.com/TechCrunch
http://www.techdirt.com/techdirt_rss.xml
http://www.techeblog.com/index.php/feed/
http://www.thesuperficial.com/index.xml
http://www.tmz.com/rss.xml
http://www.treehugger.com/index.rdf
http://www.tuaw.com/rss.xml
http://www.valleywag.com/index.xml
http://www.we-make-money-not-art.com/index.rdf
http://www.wired.com/rss/index.xml
http://www.wonkette.com/index.xml
Can anyone tell me what's wrong? Thank you so much for your help.
-
Make sure to close the out file at the end:
    # ...
    out.close()
(not entirely sure if this is the problem here)
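To make the suggestion above concrete, here is a minimal sketch of the writing step with an explicit close. It is not the original poster's script: write_table, the demo data, and the temporary path are made up for illustration, and the rows are sorted only so the output is predictable. The try/finally makes sure close() runs even if one of the writes fails.

```python
import os
import tempfile

def write_table(path, wordlist, wordcounts):
    """Write one header row, then one tab-separated row per blog."""
    out = open(path, 'w')
    try:
        out.write('Blog')
        for word in wordlist:
            out.write('\t%s' % word)
        out.write('\n')
        for blog, wc in sorted(wordcounts.items()):
            out.write(blog)
            for word in wordlist:
                # Words missing from this blog's counts are written as 0
                out.write('\t%d' % wc.get(word, 0))
            out.write('\n')
    finally:
        # Runs even if a write above raises; closing flushes buffered data
        out.close()

# Hypothetical demo data, written to a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), 'blogdata.txt')
write_table(demo_path, ['python', 'code'], {'Example Blog': {'python': 3}})

result = open(demo_path)
contents = result.read()
result.close()
print(contents)
```

If the file still does not show up after this, the script is probably stopping before it reaches the writing step, so it is worth checking the console for a traceback.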
-
From http://docs.python.org/2/library/functions.html?highlight=file#file
When opening a file, it’s preferable to use open() instead of invoking this constructor directly. file is more suited to type testing (for example, writing isinstance(f, file)).
In addition to using open() instead of file(), I would strongly encourage the use of the syntax:
with open('feedlist.txt') as in_file:
    for feedurl in in_file:  # iterate line by line; in_file.read() would yield single characters
        title, wc = getwordcounts(feedurl)
With this syntax, in_file.close() will automatically be called when the with block terminates, even if exceptions are thrown. File handles left open can lead to bugs and resource leaks. The with syntax allows you to open, use, and forget because the close() call is automatic.
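As a rough sketch of how the with syntax could be applied to the feed list: iterating over the file object yields one line at a time, each with a trailing newline, so it helps to strip the lines and skip blank ones. The helper name read_feedlist and the demo file are assumptions for illustration. Note also that the posted script calls len(feedlist) without ever defining feedlist, which would raise a NameError once execution gets that far; building the list first, as below, gives that name a value.

```python
import os
import tempfile

def read_feedlist(path):
    """Return the non-empty lines of a file, with surrounding whitespace removed."""
    with open(path) as in_file:
        # The file is closed automatically when this block ends
        return [line.strip() for line in in_file if line.strip()]

# Demo with a throwaway file containing two URLs and one blank line
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'feedlist.txt')
with open(demo_path, 'w') as f:
    f.write('http://www.joelonsoftware.com/rss.xml\n'
            '\n'
            'http://www.schneier.com/blog/index.rdf\n')

feedlist = read_feedlist(demo_path)
print(len(feedlist))
for feedurl in feedlist:
    print(feedurl)
```

With the list in hand, the main loop can iterate over feedlist, and float(bc)/len(feedlist) has a defined denominator.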