
Playing with Wolfram Alpha and Python

I love the idea of Wolfram Alpha. I haven’t used it enough to tell how the reality of it compares. Mostly what I’ve seen is easter-eggs, which are fun, but I wanted to see if I could do something more substantive.

An article I’d seen in the news lately piqued my interest in the number of PhDs by US state. It took a goodly amount of fiddling before I figured out I could get the answer by submitting this query:

‘How many phds in California?’

Take a look at what this gives you: http://www.wolframalpha.com/input/?i=how+many+phds+in+california%3F

Lots of great information, including the specific answer I want, 331,101.

If I want to do this for all fifty states, I’m going to need to use the API rather than the web interface. I registered for an API account, which was quick and easy, and checked out the bindings available on their site. The bindings left a lot to be desired, so I decided to just use urllib2.
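For anyone following along, here is a minimal sketch of building the request URL. It assumes the v2 query endpoint with its `input` and `appid` parameters; `build_query_url` and the `YOUR-APPID` placeholder are hypothetical names used only for illustration. It is written for Python 3, whereas the notebook used Python 2.7, where the equivalent helper is `urllib.urlencode` and the fetch would go through `urllib2.urlopen`.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Wolfram Alpha v2 query API.
API_URL = "http://api.wolframalpha.com/v2/query"


def build_query_url(question, appid):
    """Return the full request URL for a plain-language question."""
    # A list of pairs keeps the parameter order stable.
    params = [("input", question), ("appid", appid)]
    return API_URL + "?" + urlencode(params)


# "YOUR-APPID" is a placeholder, not a working key.
url = build_query_url("How many phds in California?", "YOUR-APPID")
print(url)
```

Keeping the URL construction in one small function makes it easy to swap in a different state or question later.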

Lately I can’t abide dealing directly with XML, so I found this nice library, xmltodict. I’m sure [insert xml parsing technique] would work just as well or better.
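As a rough illustration of pulling an answer out of the response, here is a sketch that uses the standard library’s `xml.etree.ElementTree` instead of xmltodict, so it runs without extra installs. The sample XML is hand-written to resemble the pod/subpod/plaintext layout and is not real API output; the pod title `Result` and the helper name `pod_plaintext` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like an API response; not real output.
SAMPLE_XML = """
<queryresult success="true">
  <pod title="Input interpretation">
    <subpod title=""><plaintext>California | PhDs</plaintext></subpod>
  </pod>
  <pod title="Result">
    <subpod title=""><plaintext>331,101</plaintext></subpod>
  </pod>
</queryresult>
"""


def pod_plaintext(xml_text, pod_title):
    """Return the first plaintext string in the pod with this title, or None."""
    root = ET.fromstring(xml_text.strip())
    for pod in root.iter("pod"):
        if pod.get("title") == pod_title:
            node = pod.find("subpod/plaintext")
            if node is not None:
                return node.text
    return None


print(pod_plaintext(SAMPLE_XML, "Result"))
```

Returning `None` for a missing pod lets the caller decide how to handle states where the query comes back without a direct answer.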

As I was working I realized I also wanted to pull in total population, total adult population, etc. so I could talk about percentages.
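The percentage and ranking step might look something like the sketch below. The function names are hypothetical, and the counts are made-up placeholders chosen for easy arithmetic, not real figures for any state.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def rank(percent_by_state):
    """Return lines like '#1: Name: 1.23%' sorted from highest to lowest."""
    ordered = sorted(percent_by_state.items(), key=lambda kv: kv[1], reverse=True)
    return ["#%d: %s: %.2f%%" % (i, name, pct)
            for i, (name, pct) in enumerate(ordered, start=1)]


# Placeholder counts (made up, not real data): (degree holders, adults).
counts = {"State A": (10, 1000), "State B": (30, 1000), "State C": (20, 1000)}
percents = {name: percent(p, a) for name, (p, a) in counts.items()}
lines = rank(percents)
for line in lines:
    print(line)
```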

I did this all in an IPython Notebook (if you’re not using this, you need to start, it’s totally awesome. Check out http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html).

Here is my notebook (created using Python 2.7): https://gist.github.com/4052828

Here is a rendered version of the notebook: http://nbviewer.ipython.org/4052828/

All in all it was a fun exercise, but it still felt like page scraping. The one advantage is that you can easily ask the same question with slight modifications (e.g., how many people in California vs. how many adults in Idaho) and get back essentially the same response, so that’s handy. And although Wolfram Alpha returns a machine-readable data structure (XML), it’s not exactly richly semantically tagged. There are plain-text bits that have to be parsed. For example, sometimes a population will be expressed as its raw number and sometimes as 1.2 million, so I had to put special handling in my code for that case. It would be nice if such quantities were available with no plain-text parsing required.
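To make the special handling concrete, here is one way it could be sketched. This is not the code from the notebook; `parse_quantity` and the `SCALES` table are hypothetical, and the sketch only covers the two shapes mentioned above (a raw number with commas, or a decimal followed by a scale word).

```python
import re
from decimal import Decimal

# Scale words this sketch knows about; extend as needed.
SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9}


def parse_quantity(text):
    """Turn '331,101' or '1.2 million' into an int; raise ValueError otherwise."""
    match = re.match(r"\s*([\d,]*\.?\d+)\s*([a-zA-Z]*)", text)
    if not match:
        raise ValueError("no number found in %r" % text)
    # Decimal avoids floating-point surprises when scaling values like 1.2.
    number = Decimal(match.group(1).replace(",", ""))
    word = match.group(2).lower()
    # Unknown trailing words are ignored and the number is used as-is.
    return int(number * SCALES.get(word, 1))


print(parse_quantity("331,101"))
print(parse_quantity("1.2 million"))
```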

One other improvement would be to thread the calls to the API. It takes a good amount of time to process all of these serially, and there’s no reason they couldn’t run concurrently. I’ll definitely take a look at doing this.
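A possible shape for the threaded version, assuming Python 3’s `concurrent.futures` (it is not in the Python 2.7 standard library). `fetch_all` is a hypothetical helper, and `fake_fetch` stands in for the real network call so the sketch runs offline.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(questions, fetch, max_workers=8):
    """Run fetch(question) for each question in a thread pool.

    Returns a dict mapping each question to its result.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs results correctly.
        results = list(pool.map(fetch, questions))
    return dict(zip(questions, results))


# Stand-in for the real network call, so the sketch runs offline.
def fake_fetch(question):
    return question.upper()


states = ["California", "Idaho"]
questions = ["How many phds in %s?" % s for s in states]
answers = fetch_all(questions, fake_fetch)
print(answers)
```

Since the work is mostly waiting on HTTP responses, threads are a reasonable fit here; whatever rate limits the API account has would still apply.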

If you’re just interested in results, here you go. This is the percentage of adults with PhDs and the percentage of adults with at least an associate’s degree, ranked by state from highest to lowest.

EDIT: After writing this, I found http://pypi.python.org/pypi/wolframalpha/1.0. This looks to be a nicer wrapper; I’ll have to take a look and see if it works as advertised. Let me know if you’ve used it successfully.

----------------------------------------------------------------------------------------------------
phds
#1: Washington DC: 3.68%
#2: Maryland: 2.32%
#3: Massachusetts: 2.30%
#4: New Mexico: 1.75%
#5: Vermont: 1.66%
#6: Connecticut: 1.60%
#7: Delaware: 1.56%
#8: Virginia: 1.54%
#9: California: 1.42%
#10: New Jersey: 1.41%
#11: Rhode Island: 1.39%
#12: Colorado: 1.37%
#13: Oregon: 1.35%
#14: Washington State: 1.34%
#15: Pennsylvania: 1.31%
#16: New York: 1.30%
#17: Hawaii: 1.25%
#18: New Hampshire: 1.24%
#19: Arizona: 1.18%
#20: Utah: 1.17%
#21: Minnesota: 1.16%
#22: Montana: 1.14%
#23: North Carolina: 1.13%
#24: Illinois: 1.13%
#25: Nebraska: 1.12%
#26: Maine: 1.12%
#27: Florida: 1.10%
#28: Alaska: 1.10%
#29: Kansas: 1.08%
#30: Tennessee: 1.05%
#31: Missouri: 1.05%
#32: Georgia: 1.04%
#33: Idaho: 1.02%
#34: Wyoming: 1.02%
#35: Iowa: 1.02%
#36: Wisconsin: 0.99%
#37: Michigan: 0.98%
#38: South Dakota: 0.97%
#39: Ohio: 0.97%
#40: Texas: 0.94%
#41: South Carolina: 0.93%
#42: Alabama: 0.92%
#43: Indiana: 0.91%
#44: North Dakota: 0.87%
#45: Mississippi: 0.82%
#46: Oklahoma: 0.81%
#47: Louisiana: 0.81%
#48: Kentucky: 0.80%
#49: Nevada: 0.78%
#50: Arkansas: 0.77%
#51: West Virginia: 0.77%
----------------------------------------------------------------------------------------------------
college graduates
#1: Washington DC: 50.47%
#2: Massachusetts: 48.28%
#3: Connecticut: 45.71%
#4: New Hampshire: 44.77%
#5: Colorado: 44.07%
#6: Vermont: 43.98%
#7: New Jersey: 43.76%
#8: Maryland: 43.53%
#9: Minnesota: 42.94%
#10: Hawaii: 41.90%
#11: Washington State: 41.71%
#12: Virginia: 41.58%
#13: New York: 40.38%
#14: Rhode Island: 39.72%
#15: North Dakota: 39.42%
#16: Maine: 39.06%
#17: Illinois: 39.05%
#18: Oregon: 39.04%
#19: Florida: 38.78%
#20: Nebraska: 38.50%
#21: Montana: 38.27%
#22: Kansas: 38.26%
#23: California: 38.13%
#24: Delaware: 37.30%
#25: South Dakota: 37.15%
#26: Iowa: 36.75%
#27: Wisconsin: 36.73%
#28: Pennsylvania: 36.66%
#29: Utah: 36.56%
#30: Arizona: 36.26%
#31: North Carolina: 35.96%
#32: Michigan: 34.98%
#33: Wyoming: 34.34%
#34: New Mexico: 34.09%
#35: Georgia: 33.89%
#36: South Carolina: 33.78%
#37: Idaho: 33.74%
#38: Ohio: 33.65%
#39: Missouri: 33.54%
#40: Alaska: 33.04%
#41: Texas: 31.96%
#42: Indiana: 31.04%
#43: Oklahoma: 30.72%
#44: Tennessee: 30.29%
#45: Alabama: 30.18%
#46: Nevada: 30.18%
#47: Kentucky: 28.44%
#48: Mississippi: 27.99%
#49: Arkansas: 26.74%
#50: Louisiana: 26.32%
#51: West Virginia: 25.49%

Parsing sentences with Link Grammar and Python

This is my PyCon 2012 talk on parsing sentences with Link Grammar and Python

We built a patio!

The Ladybug and I made a brick paver patio. Four inches of gravel, half inch of sand, ~1500 bricks, 14 consecutive days.

Here’s the quick version:

Here’s the slow version:

Edge-Lit Christmas Cards

I recently saw this project from Evil Mad Scientist Labs and decided to give it a go this year.  I won’t go into all the details in this post, so if you’re curious check out the original.

If you’re going to try it, here’s a word of advice: make sure you get the proper materials. At first I tried to make them with some stuff I had lying around: some smaller batteries, which I doubled up, and various LEDs, which all ended up not being bright enough. After a few failed attempts I ordered some red, green, and blue LEDs from Super Bright LEDs and got the CR2032 coin cell battery they recommended.

I also went through a couple of variations on the design. Initially I was taping the acrylic to a piece of card stock as in the original project but I found that was too flimsy for my purposes. I ended up using some “artboards” I got from Jerry’s which is just down the street from me. These worked out well as they were about as thick as the batteries and acrylic and allowed me to create a pretty sturdy frame for the whole thing.  The other difference this introduced is that the light around the edges was not blocked out and ended up creating a kinda cool glow around the frame of the card (at least while the batteries were still high on juice).

I then used some card-stock folded in threes with a window cut out.

I made a number of different designs and a certain Loony Ladybug helped out with some embellishments.  Note: these photos are a little bleedy, in reality there isn’t nearly so much glow around the bottom of the cards.

I made five cards in total with a few partially completed ones as well.  It was a fun project and I’m definitely planning on doing it again next year.  Now that I’ve got the basic design down I’ll be ready for some improvements.  I played around with some switching mechanisms to extend battery life but didn’t come up with anything I liked.  I think a current limiting resistor would be a good idea as well.  I did do one multi-layer one but I felt like it was too bulky.  I’d like to experiment with some multi-layer designs next time.

Arduino + LEDs + Ultrasonic Sensor + Pumpkin = Dude-o-lantern

A while back my lovely wife got me this Big Lebowski keychain.  I’m not sure why but when I started thinking about what kind of pumpkin to carve this year the keychain popped into my mind.  I took it apart to see how easy it would be to mod and I was pleased to see its internals were quite simple.


I had picked up a PING))) ultrasonic sensor recently so I wired up the Lebowski chip and the PING))) to my recently constructed Boarduino to get a motion detecting Dude with 6 sayings.


I figured if I was going to use this for a jack-o-lantern I needed a visual component, so I wired up four RGB LEDs and diffused them with a ping-pong ball (the soldering for this nearly killed me). Initially I was going to figure out how long each sound clip played so I could do a visual display that would coordinate with the audio, but then I realized I could just use one of the analog I/O pins on the Arduino to trigger the lights when the sounds were playing. This created a nice visual effect that went well with the audio.

Here’s what happened.

Participation

I have been lurking far too long, so I’ve decided to finally participate. Initially I started creating this blog using Django Basic Apps, and while I love how simple and easy to implement the basic blog app is, I couldn’t resist the siren song of WordPress, so here I am. The header picture was taken by my friend Katy.

I intend to write primarily about my work and my various projects.  These days I’m spending most of my time working with Python and Django.  Of all the programming languages I’ve worked with Python is by far my favorite and I’m sure there will be plenty of evangelizing about it.

I’ve also been getting into working with the Arduino. Arduino, if you’re unfamiliar with it, is a wonderful open source electronics prototyping platform whose programming language is based on Wiring.

Here’s hopin’ I keep up with it!

